โ† Back to Help Center

How the AI Works

It's just math, not magic. Here's what's actually happening.


The Simple Truth About "AI"

When people hear "AI" and "machine learning," they often imagine something mysterious or impossibly complex. Here's the truth: it's just math. Very straightforward math, applied at scale.

🎯 The Core Idea: Linear Regression

Remember "draw a line through the dots" from math class? That's essentially what machine learning does - just with millions of dots and in many dimensions instead of two.

When you learned to draw a "line of best fit" through scattered points on a graph, you were doing machine learning. The only difference is scale: instead of a handful of dots in two dimensions, the model fits millions of dots across thousands of dimensions.

Same concept. Different scale. That's it.
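The "line of best fit" idea can be shown in a few lines of Python. This is a minimal sketch with made-up data points that roughly follow y = 2x + 1; fitting the line recovers that slope and intercept.

```python
import numpy as np

# Five (x, y) points that roughly follow y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# "Draw a line through the dots": find the slope and intercept
# that minimize squared error -- the line of best fit.
slope, intercept = np.polyfit(x, y, deg=1)
print(round(slope, 1), round(intercept, 1))  # → 2.0 1.0
```

Scale this up to millions of points and thousands of dimensions and you have, at heart, what the model does.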

What Actually Happens When You Scan a Photo

Step 1: Image → Numbers (Vectorization)

Your photo is just a grid of colored pixels. Each pixel has three numbers (red, green, and blue values from 0-255). A 640×640 photo becomes:

640 × 640 × 3 = 1,228,800 numbers

That's your photo as a vector - just a long list of numbers.

Your Photo (📷 image) → As Numbers [0.2, 0.8, 0.1, ...]
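Step 1 can be sketched in a few lines. Random pixel values stand in for a real photo here; the point is just the reshape into one long vector.

```python
import numpy as np

# A hypothetical 640x640 RGB photo: height x width x 3 channels,
# each value between 0 and 255. (Random pixels stand in for a real image.)
photo = np.random.randint(0, 256, size=(640, 640, 3), dtype=np.uint8)

# Flatten the grid into one long list of numbers, scaled to 0-1.
vector = photo.reshape(-1).astype(np.float32) / 255.0
print(vector.shape)  # → (1228800,) -- exactly 640 * 640 * 3 numbers
```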

Step 2: Numbers × Weights = New Numbers (Matrix Multiplication)

The AI model is essentially a giant table of numbers (called "weights") that were learned during training. We multiply your photo's numbers by these weights:

Your Photo Vector × Model Weights = Result Vector
[1.2M numbers] × [weights matrix] = [new numbers]

This is just multiplication and addition - the same operations you learned in elementary school, just done millions of times very fast.

🧮 Think of it like a recipe

If a cake recipe says "2 cups flour + 1 cup sugar + 3 eggs", you're multiplying quantities by weights and adding them up. Neural networks do the same thing: multiply inputs by learned weights, add them up, repeat.
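The multiply-and-add step is a single line of linear algebra. This toy sketch uses a hypothetical 4-number input and a random 4×3 weight matrix, since the real learned weights aren't something we can show here.

```python
import numpy as np

# Hypothetical toy layer: 4 input numbers, a 4x3 "table" of learned weights.
inputs = np.array([0.2, 0.8, 0.1, 0.5])   # a tiny stand-in "photo vector"
weights = np.random.randn(4, 3)           # random stand-in for learned weights

# The recipe step: each output = sum of (input * weight) pairs.
# That's all a matrix multiplication is -- multiply, then add up.
outputs = inputs @ weights
print(outputs.shape)  # → (3,) -- three new numbers
```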

Step 3: Layer After Layer (Deep Learning)

We repeat this process through multiple "layers":

Input (Photo) → Layer 1 (Edges) → Layer 2 (Shapes) → Layer 3+ (Objects) → Output (Detection)

Each layer extracts more abstract features. Early layers detect simple edges and colors. Later layers recognize complex patterns and objects.
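Stacking layers is just repeating Step 2, with a simple nonlinearity in between so the layers don't collapse into one big multiplication. This sketch uses tiny hypothetical sizes (a 12-number "photo") rather than the real 1.2M-number vector.

```python
import numpy as np

def layer(x, w):
    # One layer: multiply by learned weights, add up, then apply a
    # simple nonlinearity (ReLU) so stacked layers can build on each other.
    return np.maximum(0.0, x @ w)

rng = np.random.default_rng(0)
x = rng.random(12)  # tiny stand-in for the photo vector

# Three hypothetical layers: "edges" -> "shapes" -> "objects"
w1 = rng.standard_normal((12, 8))
w2 = rng.standard_normal((8, 4))
w3 = rng.standard_normal((4, 2))

out = layer(layer(layer(x, w1), w2), w3)
print(out.shape)  # → (2,) -- one number per category
```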

Step 4: Read the Output (Classification)

The final layer produces numbers that represent confidence scores for each category the model knows about. Higher numbers = more confident.

Output: {
  "sensitive_content": 0.87,  ← 87% confident
  "safe_content": 0.13        ← 13% confident
}

Why This Works

During training, the model saw millions of labeled photos. Each time it guessed wrong, its weights were nudged slightly toward the right answer.

After seeing enough examples, the weights converge to values that generalize to new photos. It's pattern recognition through statistics.

Key insight: The model doesn't "understand" photos the way humans do. It learned statistical patterns: "when I see these pixel patterns, the answer is usually X." It's sophisticated pattern matching, not comprehension.
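The "nudge the weights until they converge" idea can be shown on the simplest possible case: one weight, a handful of hypothetical 1-D data points whose true pattern is w = 3. Real training does the same thing over millions of weights and photos.

```python
import numpy as np

# Hypothetical training data: the true pattern is y = 3 * x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

w = 0.0  # start with an uninformed weight
for _ in range(200):
    pred = w * x
    grad = 2 * np.mean((pred - y) * x)  # slope of the squared error
    w -= 0.05 * grad                    # nudge the weight a small step downhill
print(round(w, 2))  # → 3.0 -- the weight converged to the true pattern
```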

Why On-Device Matters

All of this math happens on your iPhone's Neural Engine - specialized hardware designed for exactly these matrix multiplications. This means:

🔒 Privacy by Design

We can't see your photos because they literally never leave your phone. The math happens entirely on your device. We only ship you the weights (the ~40MB model file) - your photos multiply against those weights locally.

The Bottom Line

Machine learning sounds fancy, but it's fundamentally:

  1. Convert your photo to numbers
  2. Multiply by learned weights (lots of them)
  3. Add up the results
  4. Repeat through multiple layers
  5. Read the final confidence scores

That's it. Linear algebra at scale. The "intelligence" comes from the weights, which were learned by seeing millions of examples during training.
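The whole five-step pipeline fits in a few lines. Every size here is a hypothetical miniature (a 4×4 photo, two tiny layers) chosen to keep the sketch readable.

```python
import numpy as np

rng = np.random.default_rng(1)

# 1. Convert the photo to numbers (4x4 RGB toy photo -> 48 numbers).
photo = rng.integers(0, 256, size=(4, 4, 3))
x = photo.reshape(-1) / 255.0

# 2-4. Multiply by learned weights, add up, repeat through layers.
w1 = rng.standard_normal((48, 16))  # stand-ins for learned weights
w2 = rng.standard_normal((16, 2))
h = np.maximum(0.0, x @ w1)
scores = h @ w2

# 5. Read the final confidence scores (softmax).
e = np.exp(scores - scores.max())
conf = e / e.sum()
print(conf.round(2))  # two confidences that sum to 1
```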

Now that you understand how the detection works, learn about what the confidence scores mean →