r/ChatGPT 25d ago

AI detectors suck


My tutor and I worked on the whole essay, and my teacher also helped me with it. I never even used AI. All of my friends in this class used AI, and guess what, I'm the only one who got a zero. I just put my essay into multiple detectors: four out of five say 90%+ human and the other one says 90% AI.

4.5k Upvotes


633

u/DjawnBrowne 24d ago

HS teacher here: these detectors DO NOT WORK well enough for a teacher to hold you to their results. Go above this teacher's head ASAP; they will NOT win this battle with admin.

42

u/DADNutz 24d ago

HS teacher here as well

If you monitor them throughout the writing process, you won't need an AI detector.

How, you ask?

Discuss the topic… Help craft an outline with the class… Monitor their rough drafts… Revise the rough drafts with them…

By the time the final-draft due date comes, have them make a portfolio with their notes from the discussion, their outline, and their rough drafts with revisions.

They don’t need AI and you don’t need a AI detector.

Everyone wins.

14

u/TheGalator 24d ago

No no no, you suddenly drop it on them at the end of class on Friday to do by Monday, all on their own, and then use AI so you have to work less, while making kids come to school early because your AI told you they used AI.

Rookie mistake

-101

u/ZerooGravityOfficial 24d ago

GPT lover here - why are you just taking the student's word at face value?

28

u/mjm65 24d ago

I think it's the opposite. I'm not taking "84% of this paper is AI generated" as the truth, because I can almost guarantee no one can break down how it's 84% AI.

4

u/nn123654 24d ago

That's not what it means, though. It's simply a neural net set up to solve a binary classification problem: AI or not AI.

When they say it's 84% AI, it means the model is 84% confident the paper was written using AI, not that 84% of the words were written by AI and the other 16% by a human.

In reality these confidence scores are set by network training, not by statistics. Typically you'll set what's known as an activation threshold, beyond which you'll consider it one category or the other.

See this for an example.
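To make that concrete, here's a minimal Python sketch (purely illustrative, not any real detector's code) of how a single whole-document confidence score becomes a binary label:

```python
# Hypothetical sketch: the detector emits ONE confidence score for the
# whole document; a threshold converts that score into a binary label.
def classify(p_ai: float, threshold: float = 0.5) -> str:
    """p_ai is the model's confidence that the WHOLE essay is AI-written."""
    return "AI" if p_ai >= threshold else "human"

print(classify(0.84))  # "AI" -- one score for the whole paper,
                       # not "84% of the words were AI-written"
```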

3

u/mjm65 24d ago

> the model is 84% confident the paper was written using AI,

How is it 84 instead of 83, 85 or 20?

Was the model set up and calibrated correctly? Does the inclusion of 1 or 2 words massively increase the confidence score?

Because I would imagine they wouldn't know either.

-1

u/nn123654 24d ago edited 24d ago

> How is it 84 instead of 83, 85 or 20?

It's largely arbitrary; in fact, at first it's basically random.

Literally what happens is a bunch of data goes in and you tweak the network's weights to get better performance over time. Training and gradient descent are beyond the scope of what can easily be described in a reddit comment, but there are loads of videos on YouTube that describe them.

The high-level summary is that you have different layers. The first layer is all your input into the network, say every word in the essay; the middle layers are your connections, basically your "brain" if you can call it that; and the last layer is the output layer.

Each "neuron" or node is connected to every node in each layer beyond it. Those "neurons" have a bias, called an edge weight that contains the saved data "learning/training" of the network.

What happens is you take the input, convert it to numbers, multiply them by the edge weights, add them all together, squash the result to a value between 0 and 1, and then feed that into the middle layer. If it's above the activation threshold for the neuron (usually, but not always, 0.5 or 50%) it's "fired"; if not, it's "not fired." From there you do this thousands to millions more times until you get to your output.
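Here's a toy forward pass along those lines in Python; the layer sizes, weights, and input numbers are all made up for illustration:

```python
import numpy as np

def sigmoid(x):
    # squash any value into the 0-to-1 range
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # edge weights: 3 inputs -> 4 hidden nodes
W2 = rng.normal(size=(1, 4))   # edge weights: 4 hidden nodes -> 1 output

x = np.array([0.2, 0.7, 0.1])  # the input text, already turned into numbers
hidden = sigmoid(W1 @ x)       # multiply by weights, add up, squash
output = sigmoid(W2 @ hidden)  # final confidence, one number for the input
print(output)
```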

When you train, you either give it a reward and tell it to optimize itself toward that (reinforcement learning), or you give it a bunch of labeled examples of both AI writing and non-AI writing (the training data) and run it against that dataset (supervised learning).

Each time the model is wrong you adjust some edge weights until you get the right answer. Do that a whole bunch of times and you end up with something that's right more often than it's wrong, or right a lot.
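A toy version of that adjust-until-less-wrong loop, assuming a single-weight logistic model (the data and learning rate are invented for illustration):

```python
import math

# (feature, label) pairs: a made-up one-number "essay feature" and
# whether that example was AI-written (1) or human-written (0)
examples = [(1.0, 1), (0.9, 1), (0.3, 0), (0.2, 0)]

w = 0.0    # the lone "edge weight" -- starts out arbitrary
lr = 0.5   # learning rate: how hard each wrong answer nudges w

for _ in range(1000):
    for x, y in examples:
        pred = 1 / (1 + math.exp(-w * x))  # forward pass (sigmoid)
        w -= lr * (pred - y) * x           # nudge weight to be less wrong

print(w)  # wherever w lands IS the "learning"; it fixes why a given
          # essay scores 0.84 rather than 0.83 or 0.20
```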

So to answer your question, it's 84 because on that version of the model, that's what came out based on the edge weights and the training data it was trained on.

> Was the model set up and calibrated correctly?

lololol, set up? lol no dawg. That's not a thing with AI. You can train or tweak it, sure, but it's a black box. The AI itself sets the edge weights in the model, and therefore the AI calibrates itself over time.

All you can do is look at the algorithm, look at the training data or reward, and then see what you can do to tweak it. See CGP Grey's non-technical video for a high-level understanding of this topic: https://www.youtube.com/watch?v=R9OHn5ZF4Uo

There will always be error; it's just how the technology works. Hopefully it's low enough that you can ignore it as acceptable error.

> Does the inclusion of 1 or 2 words massively increase the confidence score?

Possibly. You don't tell the AI what words to weight; it decides that on its own. It depends on what kind of connections or assumptions it learned in training.

Often AI will find really stupid ways of solving stuff that somehow work. With handwriting recognition, for instance, instead of learning to read letters like a human, a lot of the time it will just cheat and look at, say, only the top or bottom of the letter. See this video for some examples: https://www.youtube.com/watch?v=TkwXa7Cvfr8

It's absolutely possible that it could. It can learn the wrong thing and find a local minimum in gradient descent (assuming you are using that algorithm) that is not in fact the globally best solution, or it can overfit, getting too comfortable with the training data it's used to.

> Because I would imagine they wouldn't know either.

They don't, and that's the problem with AI versus statistical or mathematical models. Not only are there no mathematical proofs you can use to make sure you have a consistent answer, but AI models are designed to constantly change. You could literally get a different answer from one day to the next if they update it with new training data.

That's not to say that every problem can be solved with a statistical model, because often it can't. Statistical models are great at finding outliers and understanding relationships in data, for instance, but they struggle to find new ways of doing stuff. They are far less adaptable, because they require something to become common enough to change the dataset. For something like writing, they simply can't capture all the different ways in which AI could be used.

For instance, statistical hurricane prediction models perform terribly, mainly because they try to fit a polynomial trend to past storms and then use that curve of best fit to predict the new storm. Numerical prediction models, based on differential equations interacting with different parts of the atmosphere, perform much better but require far more computation power. Neither uses AI, which would be both less predictable and worse-performing.

AI is a function approximator, so if you know how to calculate the function and have the math to prove it, that will always be a better option than AI. What AI is really good for is when you don't or can't get those equations, or when you have a rapidly changing dataset.

See:

* Back Propagation: https://www.ibm.com/think/topics/backpropagation

* Gradient Descent: https://www.ibm.com/topics/gradient-descent

1

u/Tricky_Garbage5572 24d ago

Well, you can calibrate temperature.

1

u/nn123654 23d ago edited 23d ago

Yeah, that's definitely a thing, see: https://docs.aws.amazon.com/prescriptive-guidance/latest/ml-quantifying-uncertainty/temp-scaling.html

This is more of a change to how the softmax function at the end squishes the data into the 0-to-1 range than anything else. It applies a bias, based on past model performance, to adjust the predicted probability of an output up or down.

The most common calibration method of this kind is Platt scaling. It's a post-processing method that basically runs the model's output through a statistical model. It doesn't change the actual model, but it does change the model's final reported output.
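As a rough sketch of what temperature scaling does to that final squash (the logit and temperature values here are invented; in practice T is fitted on held-out data after training):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

logit = 1.66               # raw model output before squashing (made up)
print(sigmoid(logit))      # ~0.84: the uncalibrated confidence
T = 2.0                    # temperature fitted post hoc on held-out data
print(sigmoid(logit / T))  # ~0.70: same ranking, softer confidence
```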

23

u/bobacookiekitten 24d ago

You can say the same about technology these teachers do not understand. At least they understand their own students, and if they do not, they can take into account their past history and particular patterns, which this technology cannot. 'AI detectors' should not be taken at face value.

-27

u/ZerooGravityOfficial 24d ago

I know AI detectors are buggy - I also know I can instantly pick out a comment that's 'pure GPT'

1

u/Sip_py 24d ago

But what if you have Gemini rewrite GPT's output?

1

u/bobacookiekitten 24d ago

Comments and students are two very different things. I've sometimes been told I speak like ChatGPT.

5

u/DjawnBrowne 24d ago

Fellow GPT lover here. I've been using it since the beginning (second-year teacher in my early 30s).

It's not about whether the student is lying or not. It's that the tools aren't anywhere near good or reliable enough to prove that they are. It's wildly unethical to use a tool so (frankly) shitty to try to hold anyone accountable for anything, and unless the teacher who wrote this response in OP's portal has been living under a rock for the past two years, they 1000% know that.

Elsewhere in this thread, someone pointed out that the Constitution comes back as 96% AI generated. I highly doubt the list of stinky old bastards that crammed themselves into Liberty Hall three hundred years ago included Claude.

So I'll say it here openly: using AI is not technically plagiarism, and we have to be able to prove plagiarism. We cannot prove anyone has used artificial intelligence to supplement their writing (or whatever else), because a reliable tool to do so doesn't exist.

2

u/Feelisoffical 24d ago

The tools work quite well actually. Generally when someone swears they don’t work at all they just got caught cheating.

If someone turns in the Declaration of Independence as their own work, it makes sense an AI detector would call it AI, although technically it would be plagiarism.

2

u/ZerooGravityOfficial 22d ago

the OP literally said "all my friends were cheating, except me"...

sure bro

2

u/Feelisoffical 22d ago

Exactly, these types of posts are generally just copes.

2

u/prf_q 24d ago

You must be one of those who bought NFTs.

1

u/RaoD_Guitar 24d ago

Doesn't even matter if OP is honest or not. The teacher just can't use a method that's super inconsistent to invalidate any of his students' work.