Set up your team's project: event taxonomy, experiment backlog, metric definitions, and past results
A Juma Project is a shared space where the team stores everything Juma needs to know about how the team experiments. Create one project for the product, add context as the team learns more, and Juma uses what's relevant every time the flow runs. For experimentation, this is what keeps tests grounded in the team's actual events and decisions, instead of a generic best-practice design.
What to add
Product & Event Taxonomy
What the product's key events are and what they mean: which event counts as "activation," which as "conversion," which as "retained." With this in the project, Juma maps metrics to the right events on the first pass. Without it, Juma reads the event list from Mixpanel and asks the team to confirm which event each metric should use.
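As a rough illustration of what a taxonomy entry could capture, the sketch below maps business terms to event names. It is not Juma's internal format. The `EventDefinition` class, the `event_for` helper, and every event name except `completed_setup` are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventDefinition:
    metric: str                        # business term the team uses
    event: str                         # event name as tracked in Mixpanel
    window_days: Optional[int] = None  # optional window after signup


# Hypothetical taxonomy. Only completed_setup appears on this page;
# the other event names are placeholders for illustration.
TAXONOMY = {
    "activation": EventDefinition("activation", "completed_setup", window_days=7),
    "conversion": EventDefinition("conversion", "subscription_started"),
    "retained": EventDefinition("retained", "session_started", window_days=30),
}


def event_for(metric: str) -> Optional[EventDefinition]:
    """Return the mapped event, or None when the team still has to confirm one."""
    return TAXONOMY.get(metric.lower())
```

A lookup that returns `None` corresponds to the case where the team is asked to confirm which event a metric should use.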
Metric Definitions & Guardrails
The team's canonical primary metrics and the guardrails that must not regress on any test, such as retention, revenue per user, and support volume. Juma applies these to every experiment automatically, so no test ships on a conversion win that quietly hurts a metric the team protects.
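One way to picture a guardrail check is the minimal sketch below. It assumes each guardrail has a direction and a tolerance. The `Guardrail` class, the `regressions` function, and the tolerance values are assumptions made for illustration, not Juma's or Mixpanel's actual rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guardrail:
    name: str
    higher_is_better: bool
    tolerance: float  # largest relative move in the bad direction the team accepts


# Hypothetical guardrails and tolerances, using the metrics named above.
GUARDRAILS = [
    Guardrail("retention", higher_is_better=True, tolerance=0.01),
    Guardrail("revenue_per_user", higher_is_better=True, tolerance=0.01),
    Guardrail("support_volume", higher_is_better=False, tolerance=0.02),
]


def regressions(relative_changes: dict[str, float]) -> list[str]:
    """Return guardrails whose variant-vs-control change is harmful beyond tolerance."""
    flagged = []
    for guardrail in GUARDRAILS:
        change = relative_changes.get(guardrail.name)
        if change is None:
            continue  # no data for this guardrail; left for the team to review
        # Harm is a drop for "higher is better" metrics and a rise otherwise.
        harm = -change if guardrail.higher_is_better else change
        if harm > guardrail.tolerance:
            flagged.append(guardrail.name)
    return flagged
```

For example, `regressions({"retention": -0.03, "support_volume": 0.01})` flags only retention, because the support volume increase stays inside its tolerance.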
Experiment Backlog
The running list of hypotheses the team wants to test and why. Juma reads it to design the next test in context, reference related past tests, and avoid re-running an experiment that already has an answer. The backlog turns one-off tests into a program.
Past Experiment Log
What the team has already tested, the result, and the decision. Juma references this so a new test builds on what was learned, and so "did we already try this?" has an answer. Each readout the flow produces can append to this log automatically.
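To make the "did we already try this?" lookup concrete, here is a small sketch that compares a new hypothesis against logged ones by word overlap. Word overlap is a deliberately crude stand-in chosen for brevity and is not how Juma matches tests. The `LogEntry` fields and the sample entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    name: str
    hypothesis: str
    result: str    # e.g. "win", "loss", "inconclusive"
    decision: str  # "ship", "iterate", or "kill"


def _words(text: str) -> set[str]:
    """Lowercase the text and split it into a set of punctuation-free words."""
    return {w.strip(".,!?\"'").lower() for w in text.split()} - {""}


def related_tests(hypothesis: str, log: list[LogEntry],
                  min_overlap: float = 0.5) -> list[LogEntry]:
    """Return logged tests whose hypothesis shares enough words (Jaccard similarity)."""
    new = _words(hypothesis)
    matches = []
    for entry in log:
        old = _words(entry.hypothesis)
        union = new | old
        if union and len(new & old) / len(union) >= min_overlap:
            matches.append(entry)
    return matches


# Hypothetical log entries for illustration only.
LOG = [
    LogEntry("onboarding-checklist-v1", "A setup checklist increases activation", "win", "ship"),
    LogEntry("pricing-page-cta", "A larger pricing CTA increases conversion", "inconclusive", "iterate"),
]
```

Calling `related_tests("A setup checklist increases activation rate", LOG)` returns the onboarding entry, which is the signal that the idea already has an answer.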
Guide Juma with project info
Add a short description to each knowledge item in the project's info field so Juma knows what each file contains and when to use it. For example:
- Product & Event Taxonomy: "Use to map metrics to events. 'Activation' = completed_setup event within 7 days of signup."
- Metric Definitions & Guardrails: "Apply these guardrails to every experiment. Flag any regression in the readout."
- Experiment Backlog: "Reference when designing a new test. Check for related or duplicate hypotheses first."
- Past Experiment Log: "Reference for what we've learned. Append each new readout here."
Frequently Asked Questions
What does Juma need to set up a Mixpanel A/B experiment?
A connected Mixpanel project and a one-line description of what the team wants to test. Juma reads the project's events and business context, proposes a primary metric mapped to a real event, suggests guardrail metrics, and sizes the test. The team confirms the design before anything is created.
If the team names the metric and the lift it cares about, Juma uses them directly. If not, Juma proposes both from the events already in Mixpanel and asks the team to confirm. No exports or manual configuration are needed; the flow runs against the live project.
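For readers curious what sizing a test involves, the sketch below uses the textbook normal-approximation formula for comparing two conversion rates. It is an assumption that this resembles the calculation in the flow; Mixpanel and Juma may use a different statistical model. The function name and default values are illustrative.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate: float, relative_lift: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Users needed per variant to detect a relative lift with a two-sided z-test."""
    if not 0 < baseline_rate < 1:
        raise ValueError("baseline_rate must be between 0 and 1")
    if relative_lift <= 0:
        raise ValueError("relative_lift must be positive")
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    if p2 >= 1:
        raise ValueError("lifted rate must stay below 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)
```

The practical takeaway is the shape of the relationship: halving the lift the team wants to detect roughly quadruples the users needed per variant, which is why naming the lift up front matters.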
Does Juma launch the experiment, or does the team?
The team launches. Juma creates the experiment, its metrics, and its variants in Mixpanel as a reviewable draft, then stops. A person reviews the design and clicks launch in Mixpanel. Juma asks for explicit confirmation before any write or delete action, so nothing is created or changed silently.
This split is deliberate. Designing the test and doing the statistics is repeatable work the flow handles well. Deciding to put a change in front of real users is a judgment call that stays with the team. Every output gets human review.
How does Juma decide ship, iterate, or kill?
Juma pulls the experiment's results from Mixpanel and applies Mixpanel's own interpretation guidance: the lift on the primary metric, whether it cleared statistical significance, how each guardrail metric moved, and whether the test reached its sample size. It returns a recommendation with the reasoning shown, not just a verdict.
When a result is inconclusive, Juma says so and names what would make it callable (more runtime, a larger sample, or a cleaner metric) rather than forcing a winner. The recommendation is evidence the team can audit. Strategy, taste, and judgment stay human; Juma makes the data legible enough to decide on.
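The inputs listed above can be combined into a simple decision rule. The sketch below is one plausible ordering of the checks, written to show how a recommendation can carry its reasoning. The `Readout` fields, the order of the checks, and the wording of each reason are assumptions, not Juma's actual logic or Mixpanel's guidance.

```python
from dataclasses import dataclass, field


@dataclass
class Readout:
    lift: float                # relative lift on the primary metric
    significant: bool          # whether the lift cleared statistical significance
    reached_sample_size: bool  # whether the test reached its planned sample
    guardrail_regressions: list[str] = field(default_factory=list)


def recommend(readout: Readout) -> tuple[str, str]:
    """Return (recommendation, reasoning) for a finished or running test."""
    if not readout.reached_sample_size:
        return "inconclusive", "planned sample size not reached; more runtime needed"
    if not readout.significant:
        return "inconclusive", "no significant difference; a larger sample or cleaner metric may help"
    if readout.lift <= 0:
        return "kill", "variant is significantly worse than control on the primary metric"
    if readout.guardrail_regressions:
        names = ", ".join(readout.guardrail_regressions)
        return "iterate", f"primary metric improved but guardrails regressed: {names}"
    return "ship", "significant lift on the primary metric with no guardrail regressions"
```

Checking sample size first is a design choice in this sketch: it avoids calling a winner from an early peek at the data.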
Can Juma read out an experiment we already ran?
Yes. Point Juma at an experiment that is already live or concluded and ask for the results. It finds the experiment by name, pulls the data, and produces the same ship/iterate/kill readout without re-running the design step. This is the fastest entry point if the team already has tests in Mixpanel.
Juma can also list every experiment in the project and flag which ones are ready to call, so a backlog of open tests gets cleared instead of sitting past its end date.
How is this different from Mixpanel's experiment reports or a data analyst?
Mixpanel shows the numbers; it does not design the test for you, size it, or tell you what to do with the result. Juma does the design, the sample-size math, the guardrail check, and the ship/iterate/kill readout in chat, then hands the decision to the team. It augments the analyst; it does not replace one.
For teams without a dedicated analyst, the flow covers the parts that usually get skipped: a sound design and an honest read of significance. For teams with an analyst, it removes the repetitive setup and reporting so the analyst spends time on the questions that need real judgment.