from some_llm_library import LLMClient

client = LLMClient(api_key="<api_key_here>")

system_prompt = """
You are a mortgage document classification model. Treat every page as an isolated unit.
Your task is to identify the document type of a single page taken from a multi-page mortgage file.

Rules:
- Choose exactly one label from the provided list.
- Do NOT generate new labels or modify label names.
- Do NOT provide explanations, reasoning, or extra text.
- Output must contain exactly two fields: "label" and "confidence".
- "confidence" must be a float between 0 and 1.
- If the content does not clearly match any of the specific labels provided, or if the page is blank, illegible, or irrelevant, you MUST classify it as Unclassified.

Always return the result strictly in the required JSON format.
"""

# The literal braces of the JSON example are doubled ({{ and }}) so that
# str.format() substitutes only {PAGE_TEXT}; single braces would raise a KeyError.
user_prompt_template = """
Classify the following page into exactly one of the predefined mortgage document labels.
Only choose a label from the complete list shown below.
Do NOT invent any new labels or alter the label names.

Allowed labels (complete list):
- Mortgage - Closing Disclosure - Seller
- Lender - Rate Note
- Title - Rider
- Property - Tax Record Information Sheet
- Title - Signature / Name Affidavit (Ack)
- Unclassified

Page Content:
{PAGE_TEXT}

Return your answer in the following JSON format:
{{
  "label": "<one_of_the_labels>",
  "confidence": <float_between_0_and_1>
}}
"""

# Example page text (in practice, this would come from OCR processing of the document page)
page_text = "THIS IS SOME OCR TEXT FROM THE DOCUMENT PAGE..."

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt_template.format(PAGE_TEXT=page_text)},
]

response = client.generate(
    model="<model_name>",
    messages=messages,
    temperature=0,  # deterministic output suits classification tasks
    max_tokens=200,
)

print(response.output)
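
The prompts ask for exactly two fields and a label from a fixed list, but nothing in the script checks that the reply obeys those rules. A minimal sketch of such a check follows. The names `ALLOWED_LABELS`, `FALLBACK`, and `parse_classification` are hypothetical helpers introduced here, and the sketch assumes the reply arrives as a plain JSON string. Falling back to `Unclassified` with zero confidence is one possible policy, not the only one.

```python
import json

# Labels copied from the prompt; keep this set in sync with the prompt text.
ALLOWED_LABELS = {
    "Mortgage - Closing Disclosure - Seller",
    "Lender - Rate Note",
    "Title - Rider",
    "Property - Tax Record Information Sheet",
    "Title - Signature / Name Affidavit (Ack)",
    "Unclassified",
}

# Hypothetical fallback used whenever the reply breaks the prompt's rules.
FALLBACK = {"label": "Unclassified", "confidence": 0.0}


def parse_classification(raw_text):
    """Validate the model's reply; return a copy of FALLBACK on any violation."""
    try:
        data = json.loads(raw_text)
    except (TypeError, ValueError):
        # Not a string, or not valid JSON.
        return dict(FALLBACK)
    # The reply must be an object with exactly the two required fields.
    if not isinstance(data, dict) or set(data) != {"label", "confidence"}:
        return dict(FALLBACK)
    label, confidence = data["label"], data["confidence"]
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        return dict(FALLBACK)
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return dict(FALLBACK)
    if not 0.0 <= confidence <= 1.0:
        return dict(FALLBACK)
    return {"label": label, "confidence": float(confidence)}
```

If `response.output` holds the raw reply text, as the `print` call suggests, it could be passed straight in: `result = parse_classification(response.output)`.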