Step 1: Upload Files & Configure Structure
Database File Structure
Questions File Structure
Step 2: Validation & Configuration
After upload, the app validates both files. Possible statuses: "Validating files...", "✓ Files validated successfully", "✗ Validation failed", or "⚠ Warnings". When a correct answers column is present, it is validated as well ("✓ Correct answers column validated").
Database Columns Preview
| Letter | Name | Description | Unit |
|---|---|---|---|
Questions Preview
| # | Metric | Question |
|---|---|---|
Correct Answers Preview
| # | Metric | Expected Answer |
|---|---|---|
Pipeline Configuration
Configure how the matching pipeline processes questions. Expand each stage to learn what it does and tune its parameters.
Stage 0. Learning Normalization: process the learning rules file
Processes your learning.txt file to normalize rules, detect conflicts, and optimize for matching. Runs once at startup before processing questions.
Stages 1-3. Candidate Retrieval: find potential column matches
Combines BM25 keyword search with semantic embeddings to find candidate database columns. Uses hybrid search with Reciprocal Rank Fusion (RRF) to balance exact keyword matches with semantic similarity.
Candidates by Database Size
Adaptive Weights (Dense vs BM25)
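The hybrid retrieval described above can be sketched as follows. `rrf_fuse` is a hypothetical helper, and both the damping constant `k=60` and the 50/50 weight split are illustrative defaults, not the app's actual settings:

```python
# Sketch of weighted Reciprocal Rank Fusion (RRF) over two rankings of
# database column letters: one from BM25, one from dense embeddings.

def rrf_fuse(bm25_ranked, dense_ranked, k=60, dense_weight=0.5):
    """Fuse two ranked lists of column IDs (best first) into one.

    k: RRF damping constant (60 comes from the original RRF paper).
    dense_weight: adaptive weight for the dense list; BM25 gets the rest.
    """
    scores = {}
    for rank, col in enumerate(bm25_ranked, start=1):
        scores[col] = scores.get(col, 0.0) + (1 - dense_weight) / (k + rank)
    for rank, col in enumerate(dense_ranked, start=1):
        scores[col] = scores.get(col, 0.0) + dense_weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# A column ranked well by both retrievers beats one ranked well by only one:
fused = rrf_fuse(["A", "B", "C"], ["B", "D", "A"])  # "B" wins
```

Because RRF works on ranks rather than raw scores, the BM25 and embedding scores never need to be normalized against each other.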
Stage 4. AI Reranking: filter candidates with AI
Uses AI to re-score and filter candidate columns based on question context and classification. Narrows down from K candidates to a smaller set for detailed matching.
Stages 5-6. Main Matching & Certainty: core matching algorithm
Primary GPT-4o chain-of-thought matching that analyzes candidates and selects the best match(es). Calculates certainty score combining embedding similarity and AI confidence.
Certainty Level Thresholds
Certainty score determines the confidence level: Uncertain < Probable < Certain. Below the empty answer threshold, the answer is left empty.
Thresholds must be in order: Empty Answer < Uncertain < Probable
Certainty Score Weights
Controls how certainty score is calculated. Weights must sum to 1.0.
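A minimal sketch of the certainty calculation and level mapping described above. The threshold and weight values here are illustrative, not the app's actual defaults:

```python
# Thresholds must be ascending: empty answer < Uncertain < Probable.
EMPTY_ANSWER_T, UNCERTAIN_T, PROBABLE_T = 0.30, 0.55, 0.80

def certainty_score(embedding_sim, ai_confidence,
                    embedding_weight=0.4, confidence_weight=0.6):
    """Weighted combination of embedding similarity and AI confidence."""
    assert abs(embedding_weight + confidence_weight - 1.0) < 1e-9, \
        "Weights must sum to 1.0"
    return embedding_sim * embedding_weight + ai_confidence * confidence_weight

def certainty_level(score):
    """Map a certainty score to its categorical level."""
    if score < EMPTY_ANSWER_T:
        return "No Data"      # answer is left empty
    if score < UNCERTAIN_T:
        return "Uncertain"
    if score < PROBABLE_T:
        return "Probable"
    return "Certain"

score = certainty_score(0.72, 0.90)   # 0.72*0.4 + 0.90*0.6 = 0.828
level = certainty_level(score)        # "Certain" with these thresholds
```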
Stage 7. Self-Consistency Sampling: verify with multiple samples
When initial confidence is low, generates multiple samples with different temperatures and uses voting to verify the answer. Helps catch unstable matches where AI gives different answers.
Confidence Bonus Settings
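The sampling-and-voting step can be sketched as follows; `sample_fn` stands in for one matching call and its signature is an assumption, as are the temperature values:

```python
from collections import Counter

def self_consistency_vote(sample_fn, temperatures=(0.0, 0.4, 0.8)):
    """Re-run the matcher at several temperatures and take a majority vote.

    sample_fn(temperature) -> answer (e.g. a column letter).
    Returns (winning_answer, agreement_ratio); low agreement signals an
    unstable match that should be flagged for review.
    """
    answers = [sample_fn(t) for t in temperatures]
    (winner, votes), = Counter(answers).most_common(1)
    return winner, votes / len(answers)
```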
Stage 8. Ambiguity Detection: find alternative matches
Detects when multiple database columns could reasonably answer the question. Flags high/medium ambiguity for human review and lists alternative matches.
Stages 9-10. Validation & Retry: verify answer quality
Validates the selected answer for scope/type/semantic mismatches. If validation finds issues and confidence is below threshold, re-runs matching with feedback.
G. API Retry Settings: handle rate limits
Controls retry behavior for API calls that fail with rate-limit or server errors. Uses exponential backoff with jitter to ride out Azure's tokens-per-minute limits, which are enforced over 60-second windows.
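A sketch of exponential backoff with full jitter as described above; the retry count and delay values are illustrative, not the app's configured defaults:

```python
import random
import time

def with_retries(call, max_retries=5, base_delay=1.0, max_delay=60.0):
    """Run call(); on failure, retry with capped exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries, surface the error
            # Exponential growth capped at max_delay, with full jitter so
            # parallel workers do not retry in lockstep against the limit.
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(random.uniform(0, delay))
```

Capping the delay at 60 seconds matches the length of the rate-limit window: waiting longer than one window gains nothing.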
G. Embedding & JSON Parsing: technical settings
Controls embedding generation batch size and JSON response parsing/repair behavior.
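A sketch of the kind of JSON repair this setting governs: model responses sometimes wrap JSON in markdown fences or surround it with prose, so strip those before parsing. The heuristics here are illustrative, not the app's exact logic:

```python
import json
import re

def parse_model_json(text):
    """Parse JSON out of a model response, repairing common wrappers."""
    # Strip a ```json ... ``` fence if present.
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span to drop leading/trailing prose.
        start, end = text.find("{"), text.rfind("}")
        if start != -1 and end > start:
            return json.loads(text[start:end + 1])
        raise
```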
Step 3: Configure & Run
Run Options
Confidence Thresholds
Cache Options
Progress
Live Logs
Results
| # | Metric | Answer | Match Name | Expected | Expected Name | Certainty | Score | Status |
|---|---|---|---|---|---|---|---|---|
Output File & Logs Documentation
Output Excel File
The downloaded Excel file contains a single sheet named Results. Column headers are in row 1; data starts at row 2 with one row per question.
| Col | Header | Type | Values / Format | Description |
|---|---|---|---|---|
| A | Row ID | Number | e.g. 1, 2, 3 | Original row identifier from the questions file |
| B | Metric | String | Free text | Question metric name (short label) |
| C | Definition | String | Free text | Full question text / definition |
| D | Answer | String | Column letter(s), space-separated | Matched database column letter(s). Empty if no match. Up to 3 columns. |
| E | Match Name | String | Pipe-separated names | Full name(s) of matched column(s) including unit. Empty if no match. |
| F | Verification | String | Free text | Proof text from column descriptions showing why this match is correct |
| G | Explanation | String | Free text | Chain-of-thought reasoning from the AI explaining the matching logic |
| H | Confidence | Number | 0.00 – 1.00 | AI's self-assessed confidence in the match |
| I | Needs Review | String | YES / NO | Human review flag. YES when: confidence < review threshold, samples disagree, ambiguity is MEDIUM/HIGH, or validation finds issues |
| J | Certainty Score | Number | 0.00 – 1.00 | Combined score = (embedding similarity × weight) + (AI confidence × weight) |
| K | Certainty Level | String | Certain / Probable / Uncertain / No Data | Categorical level derived from certainty score thresholds |
| L | Expected Answer | String | Column letter(s) | Optional — expected correct answer (only when validation file provided) |
| M | Expected Name | String | Full name(s) | Optional — full name(s) of expected column(s) (only when validation file provided) |
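A small sketch of post-processing the Results sheet: collect the rows flagged for human review (column I) and sort them by certainty score (column J), worst first. Rows are shown here as plain tuples in column order A..K; in practice you would load them from the Excel file with openpyxl or pandas. The example metric names are made up:

```python
def rows_needing_review(rows):
    """Return review-flagged rows, lowest certainty score first."""
    flagged = [r for r in rows if r[8] == "YES"]   # column I: Needs Review
    return sorted(flagged, key=lambda r: r[9])     # column J: Certainty Score

rows = [
    # (Row ID, Metric, Definition, Answer, Match Name, Verification,
    #  Explanation, Confidence, Needs Review, Certainty Score, Certainty Level)
    (1, "Scope 1 emissions", "...", "D", "Scope 1 emissions (tCO2e)",
     "...", "...", 0.92, "NO", 0.88, "Certain"),
    (2, "Water use", "...", "", "", "...", "...", 0.41, "YES", 0.37, "Uncertain"),
]
review_queue = rows_needing_review(rows)  # only row 2 is flagged
```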
Logs ZIP
The downloaded ZIP contains one timestamped directory with per-question subdirectories:
{YYYY-MM-DD - HH:MM}/
├── run-summary.json
├── Question 01 - 5/
│ ├── ia-calls.json
│ ├── Logs.json
│ ├── 01-QueryExpansion-Prompt.txt
│ ├── 01-QueryExpansion-Response.txt
│ ├── 06-MainMatching-Prompt.txt
│ ├── 06-MainMatching-Response.txt
│ └── ...
├── Question 02 - 6/
│ └── ...
| File | Description |
|---|---|
| run-summary.json | Overall run statistics: question count, total tokens, cost, duration |
| ia-calls.json | Consolidated log of all AI calls for a question: model, tokens, cost, duration, cache status, parsed results |
| Logs.json | Legacy metadata: question info, candidate count, matching attempts with temperature and confidence |
| {NN}-{Stage}-Prompt.txt | Raw prompt text sent to the AI at each pipeline stage |
| {NN}-{Stage}-Response.txt | Raw AI response at each pipeline stage |
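Extracted log directories can be aggregated programmatically, e.g. to total cost across several runs. The JSON field names used below (`questions`, `total_tokens`, `cost_usd`) are assumptions; check the keys in your own run-summary.json:

```python
import json
from pathlib import Path

def summarize_runs(logs_root):
    """Sum run-summary.json stats across all timestamped run directories."""
    totals = {"questions": 0, "total_tokens": 0, "cost_usd": 0.0}
    for summary_path in Path(logs_root).glob("*/run-summary.json"):
        data = json.loads(summary_path.read_text(encoding="utf-8"))
        for key in totals:
            totals[key] += data.get(key, 0)  # missing keys count as zero
    return totals
```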
AI Pipeline Stages
| # | Stage | Description |
|---|---|---|
| 00 | LearningNormalization | Normalize and deduplicate learning rules |
| 01 | QueryExpansion | Expand question for better embedding search |
| 02 | ColumnInjection | Force-include columns from learning rules |
| 03 | PreClassification | Classify question type, scope, ESG category |
| 04 | Embedding | Semantic similarity search |
| 05 | Reranking | AI re-scoring of candidates |
| 06 | MainMatching | Primary GPT-4o column matching |
| 07 | SelfConsistency | Additional samples for verification |
| 08 | AmbiguityDetection | Detect alternative matches |
| 09 | Validation | Verify scope/type alignment |
| 10 | ValidationRetry | Retry with correction if validation fails |
| 11 | ColumnClarification | Disambiguate similar columns |
Accuracy Report
📝 Generated Learning Rules
These rules were generated from mismatched answers. Copy and add to your learning file to improve future matches.