GET /confidence-scoring/jobs
Python
from meibel import MeibelClient

client = MeibelClient(api_key="your-api-key")

result = client.confidence_scoring.list_scoring_jobs()
print(result)
[
  {
    "job_id": "<string>",
    "agent_identity_context": {
      "customer_id": "<string>",
      "project_id": "<string>",
      "agent_name": "<string>",
      "agent_version": "<string>",
      "agent_session_id": "<string>",
      "agent_turn": 123,
      "agent_workflow_name": "<string>",
      "agent_workflow_version": "<string>",
      "agent_workflow_session_id": "<string>",
      "batch_definition_id": "<string>",
      "batch_execution_id": "<string>",
      "tool_id": "<string>",
      "tool_instance_id": "<string>",
      "tool_execution_id": "<string>"
    },
    "module": "<string>",
    "status": "<string>",
    "score": 123,
    "explanation": "<string>"
  }
]

Authorizations

Meibel-API-Key
string
header
required

Query Parameters

agent_name
string | null

Filter by agent name.

agent_version
string | null

Filter by agent version.

agent_session_id
string | null

Filter by agent session ID.

agent_workflow_name
string | null

Filter by workflow name.

agent_workflow_version
string | null

Filter by workflow version.

agent_workflow_session_id
string | null

Filter by workflow session ID.

tool_id
string | null

Filter by tool identifier.

tool_instance_id
string | null

Filter by tool instance identifier.

tool_execution_id
string | null

Filter by tool execution identifier.
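
The filters above are sent as query-string parameters on GET /confidence-scoring/jobs. As a minimal sketch using only the Python standard library, the request could be assembled as follows. The base URL `https://api.example.com` and the helper name `build_jobs_request` are hypothetical placeholders, not part of the Meibel SDK, and leaving out unset filters (rather than sending them as null) is an assumption.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URL: replace with the real API host.
BASE_URL = "https://api.example.com"

# Filter names taken from the Query Parameters list above.
ALLOWED_FILTERS = {
    "agent_name", "agent_version", "agent_session_id",
    "agent_workflow_name", "agent_workflow_version",
    "agent_workflow_session_id",
    "tool_id", "tool_instance_id", "tool_execution_id",
}


def build_jobs_request(api_key: str, **filters: Optional[str]) -> Request:
    """Build (but do not send) a GET request for the scoring-jobs list."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unknown filter(s): {sorted(unknown)}")

    # Assumption: an unset filter is simply omitted from the query string.
    # Keys are sorted so the resulting URL is deterministic.
    params = {k: v for k, v in sorted(filters.items()) if v is not None}

    url = f"{BASE_URL}/confidence-scoring/jobs"
    if params:
        url += "?" + urlencode(params)

    # The Meibel-API-Key header is the one listed under Authorizations.
    return Request(url, headers={"Meibel-API-Key": api_key}, method="GET")


# "support-bot" is an illustrative value, not a real agent name.
req = build_jobs_request("your-api-key", agent_name="support-bot", tool_id=None)
print(req.full_url)
```

The request object can then be passed to `urllib.request.urlopen`, or the same URL and header can be reused with any other HTTP client.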

Response

Successful Response

An array of scoring job objects, each with the following fields.

job_id
string
required

Unique identifier for this scoring job.

agent_identity_context
AgentIdentityContext · object
required

The agent, workflow, and tool context that produced the scored output.

module
string
required

The scoring module used to evaluate the output. Judge-based modules (e.g. correctness, coherence, faithfulness) produce scores on a 0–10 scale. Statistical modules (e.g. observed_consistency, data_grounding) produce scores on a 0.0–1.0 scale.

status
string
required

Current status of the scoring job: submitted, in_progress, completed, failed, or not_run.
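
Because the endpoint returns jobs in any of these states, callers will often want to tally them or pick out the ones still running. The sketch below works on the parsed JSON list. The helper names and sample data are illustrative, and treating submitted and in_progress as the only non-terminal states is an assumption.

```python
from collections import Counter

# Status values listed in the field description above.
KNOWN_STATUSES = ("submitted", "in_progress", "completed", "failed", "not_run")

# Assumption: only these two states mean the job may still produce a score.
PENDING_STATUSES = {"submitted", "in_progress"}


def summarize_statuses(jobs):
    """Return a count for every known status, including zeros."""
    counts = Counter(job["status"] for job in jobs)
    return {status: counts.get(status, 0) for status in KNOWN_STATUSES}


def pending_job_ids(jobs):
    """Return the job_id of every job that has not reached a final state."""
    return [job["job_id"] for job in jobs if job["status"] in PENDING_STATUSES]


# Illustrative data shaped like the response body; not real API output.
sample_jobs = [
    {"job_id": "job-1", "status": "completed"},
    {"job_id": "job-2", "status": "in_progress"},
    {"job_id": "job-3", "status": "failed"},
    {"job_id": "job-4", "status": "submitted"},
]

print(summarize_statuses(sample_jobs))
print(pending_job_ids(sample_jobs))  # ['job-2', 'job-4']
```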

score
number | null

The computed confidence score, or null if the job has not completed. Range depends on the module: 0–10 (integer) for judge-based modules, 0.0–1.0 for statistical modules.
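
Since the two module families use different scales, scores need to be brought onto a common range before they can be compared. The sketch below rescales judge-based scores to 0.0–1.0. The module sets contain only the examples named on this page and are likely incomplete, and `normalized_score` is a hypothetical helper, not an SDK function.

```python
from typing import Optional

# Only the example modules named in this reference; the real lists may be longer.
JUDGE_MODULES = {"correctness", "coherence", "faithfulness"}
STATISTICAL_MODULES = {"observed_consistency", "data_grounding"}


def normalized_score(module: str, score: Optional[float]) -> Optional[float]:
    """Map a job's score onto 0.0-1.0, or return None if there is no score yet."""
    if score is None:
        # The job has not completed, so there is nothing to rescale.
        return None
    if module in JUDGE_MODULES:
        # Judge-based modules report integers on a 0-10 scale.
        return score / 10.0
    if module in STATISTICAL_MODULES:
        # Statistical modules already report on a 0.0-1.0 scale.
        return float(score)
    # Refuse to guess the scale of a module this sketch does not know about.
    raise ValueError(f"unknown scoring module: {module!r}")


print(normalized_score("correctness", 7))        # 0.7
print(normalized_score("data_grounding", 0.85))  # 0.85
print(normalized_score("coherence", None))       # None
```

Raising on an unrecognized module is a deliberate choice here: silently assuming a scale would make a 0–10 score of 1 indistinguishable from a perfect 0.0–1.0 score of 1.0.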

explanation
string | null

Human-readable explanation of the score.