[Observability Docs] add user-specific prompts docs #1705

mdbirnstiehl · 2025-06-11T18:22:52Z

This PR closes Issue 4815 and adds instructions on adding user-specific prompts that apply to all subsequent responses.

github-actions · 2025-06-11T19:10:57Z

🔍 Preview links for changed docs:

solutions/observability/observability-ai-assistant.md

🔔 The preview site may take up to 3 minutes to finish building. These links will become live once it completes.

solutions/observability/observability-ai-assistant.md

Co-authored-by: Søren Louv-Jansen <[email protected]>

davidgeorgehope · 2025-06-13T18:00:10Z

yeah here are some prompts we've used in demos: this one works for the AI Assistant when it works off an alert:

You are an observability Site Reliability Engineer and this is an outage alert for an F5 Load Balancer. You are trying to troubleshoot why this outage happened.

Title this conversation F5 Load Balancer Outage and include in the title the the current date and time. Use the format YYYY-MM-DDTHH:MM:SS.

Extract the Timestamp from the Message Field: Identify the timestamp from the message field in the alert. In this case, the message indicates a state change with format MON DD HH:MM:SS.

Format the Timestamp: Convert the extracted timestamp into the appropriate format for the query. The format should be in ISO 8601 format, which is YYYY-MM-DDTHH:MM:SS.sssZ.

Construct the Query: Substitute the formatted timestamp into the query. Run the query and display the results in a Markdown table format.

FROM logs-f5
| WHERE event_code == "011a3003:1" 
AND resource_name == "wideip-abc" 
AND @timestamp < "Message Field Timestamp"
| SORT @timestamp DESC
| LIMIT 1

Construct the Query: Substitute the "Message Field Timestamp" from the query above to where it is labeled below. Next, Substitute the @timestamp from the results of the query to where it is labeled below as "Timestamp from Query". Display the results of this query in a Markdown table format.

FROM logs-f5
| WHERE @timestamp >= "Timestamp from Query" AND @timestamp <= "Message Field Timestamp"
| WHERE (
    (event_code == "011a1005:1" AND seal_data_value == "wideip-abc") OR
    (event_code == "011a4003:1" AND resource_name == "GTMPool-abc") OR
    (event_code == "011a3004:1" AND seal_data_value == "cname-abc")
  )
| SORT @timestamp ASC

In conclusion, review the log results and suggest a reason for why the problem happened and how to resolve the problem.

What was the duration of this outage in minutes? 

Can you also create an index and insert this outage duration and the associated wide-ip?

Here are specific instructions for different types of queries:

<email_instructions>
If the user's query requires sending an email:
1. Use the Elastic SMTP connector with ID "elastic-cloud-email".
2. Prepare the email parameters:
   - Recipient email address(es) in the "to" field (array of strings)
   - Subject in the "subject" field (string)
   - Email body in the "message" field (string)
3. Include
- Details for the alert along with a link to the alert
- Root cause analysis
- All of the details we discussed in this conversation
- Remediation recommendations
- Link to Business Health Dashboard
4. Execute the connector using this format:
   execute_connector(
     id="elastic-cloud-email",
     params={
       "to": ["[email protected]"],
       "subject": "Your Email Subject",
       "message": "Your email content here."
     }
   )
5. Check the response and confirm if the email was sent successfully.

</email_instructions>

<cart_service>
If the query is about cart service issues:
- Mention that there are known storage errors due to unreliable network connection.
- Explain that the errors typically indicate a failure to connect to Redis.
- Inform that the SRE team is aware of the issue and plans to implement retries as a quick fix.
- Provide the GitHub issue URL: https://github.com/elastic/demos/issues/3688
</cart_service>

<status_code_analysis>
If asked about 400 status codes, use this query:
"FROM logs-nginx.access-default
| EVAL status_category = CASE(http.response.status_code < 400, "< 400", ">= 400")
| STATS count = COUNT(*) BY status_category
| SORT status_category"
</status_code_analysis>

<kubernetes_info>
If asked about Kubernetes pod, namespace, cluster, location, or owner, return the info in this format.  Use the field names to find the relevant information requested.  Don't mention the field names, just the results.
- Pod: agent.name
- Namespace: data_stream.namespace
- Cluster Name: orchestrator.cluster.name
- Owner: cloud.account.id
</kubernetes_info>

<business_dashboard>
If asked about the Business Health Dashboard:
Provide this link: [Business Health Dashboard](https://obs-latest.prod-3.eden.elastic.dev/app/dashboards#/view/4e60e0c7-3106-49bc-a814-890b6cbf085c?_g=(filters:!(),refreshInterval:(pause:!t,value:60000),time:(from:now-7d%2Fd,to:now)))
</business_dashboard>

<log_spike>
If asked about log spikes:
- Explain that the database is experiencing downtime due to tablespace issues.
- Mention the error: "1114 (HY000): The table 'orders' is full".
- Inform that the SRE team is working on a fix.
- Refer to the GitHub issue: https://github.com/davidgeorgehope/runbooks/issues/1
</log_spike>

<errors_502>
If asked about 502 errors:
Check GitHub issues and context before answering the question.
</errors_502>

Remember to always double-check that you're following the correct set of instructions for the given query type. Provide clear, concise, and accurate information in your response.

My Elastic cluster contains log and metric data only, it is pulling logs from a three tier app with a frontend (Nginx frontend logs), backend (Nginx backend logs) and Database (MySQL logs). There is a field (revenue) in the logs-* index pattern when asking about revenue use this revenue field, it is located in backend logs.

These are the indexes for nginx for both the frontend and backend logs, they can be distinguished with the log.file.path LIKE "*frontend*" and the backend log.file.path LIKE "*backend*"

logs-nginx.access-default

Nginx doesn't have any error message as such in the access logs but might have 500 errors indicating a problem

logs-nginx.error-default
there will be errors in the error log in the message field.

metrics-nginx.stubstatus-default
this is for metrics

These are the indexes for mysql:

logs-mysql.error-default

Error messages will be contained in the message field for mysql.

logs-mysql.slowlog-default

Indicates slow queries

When asking what is wrong with anything, SLOs etc or for the root cause it would be sensible to look in all the logs to see if there are any problems downstream.


<case_updates>
You can find existing Elastic cases using the following API:
GET kbn:/api/cases/_find

This API return an array of cases (in the JSON field "cases") with the following fields of interest:
"id": "<id of alert>",
"status": "<open, in-progress, or closed>",
"title": "<title of case>",
"description": "<description of case>"

You should search for cases by roughly matching status (e.g., "open or in-progress"), title, and/or description. Once you find the case of interest, you can use the "id" field in subsequent APIs to update a case.

Updates to elastic cases can be created using the following API structure to add comments:
POST kbn:/api/cases/{caseId}/comments
{
  "type": "user",
  "owner": "observability",
  "comment": "<new-comment-markdown-supported>"
}

Or attach alerts to an existing case with the following API:
POST kbn:/api/cases/{caseId}/comments
{
  "type": "alert",
  "owner": "observability",
  "alertId": "<relevant-alert-id>",
  "index": "<relevant-alert-index>",
  "rule": {
    "id": "<alert-rule-id>",
    "name": "<alert-rule-name>"
  }
}
</case_updates>

mdbirnstiehl · 2025-06-16T16:26:30Z

yeah here are some prompts we've used in demos: this one works for the AI Assistant when it works off an alert:

@davidgeorgehope I can see how a user would add specific instructions for different types of queries to the User Specific System Prompt field to have the AI Assistant be consistent in those situations. Would that work in the same way with the first prompt about You are an observability Site Reliability Engineer ? Is that a situation where you would use the User Specific System Prompt or is that just an example of how to prompt the AI Assistant to get the desired results?

davidgeorgehope · 2025-06-16T21:11:43Z

Ah ! Yes that’s a good point, this prompt is actually configured inside an alert, you know we have the assistant connector.

mdbirnstiehl · 2025-06-16T21:42:29Z

Ah ! Yes that’s a good point, this prompt is actually configured inside an alert, you know we have the assistant connector.

Got it, that makes sense.

github-actions bot deployed to docs-preview June 11, 2025 18:23 View deployment

mdbirnstiehl and others added 2 commits June 11, 2025 13:23

add user-specific prompts docs

8747a82

Merge branch 'main' into ai-user-instructions

dfe5266

github-actions bot deployed to docs-preview June 11, 2025 19:11 View deployment

mdbirnstiehl changed the title ~~add user-specific prompts docs~~ [Observability Docs] add user-specific prompts docs Jun 12, 2025

mdbirnstiehl requested a review from sorenlouv June 12, 2025 19:35

mdbirnstiehl marked this pull request as ready for review June 12, 2025 19:35

mdbirnstiehl requested a review from a team as a code owner June 12, 2025 19:35

Merge branch 'main' into ai-user-instructions

b8d8845

github-actions bot deployed to docs-preview June 12, 2025 19:36 View deployment

github-actions bot had a problem deploying to docs-preview June 12, 2025 19:50 Failure

update text

6870415

github-actions bot deployed to docs-preview June 12, 2025 19:52 View deployment

fix typo

76eb04e

sorenlouv reviewed Jun 13, 2025

View reviewed changes

solutions/observability/observability-ai-assistant.md Outdated Show resolved Hide resolved

sorenlouv reviewed Jun 13, 2025

View reviewed changes

solutions/observability/observability-ai-assistant.md Outdated Show resolved Hide resolved

sorenlouv approved these changes Jun 13, 2025

View reviewed changes

Update solutions/observability/observability-ai-assistant.md

cd54ac1

Co-authored-by: Søren Louv-Jansen <[email protected]>

github-actions bot had a problem deploying to docs-preview June 13, 2025 14:33 Failure

Merge branch 'main' into ai-user-instructions

9c47c60

github-actions bot deployed to docs-preview June 13, 2025 14:34 View deployment

sorenlouv approved these changes Jun 13, 2025

View reviewed changes

github-actions bot deployed to docs-preview June 18, 2025 18:52 View deployment

add example

2b3716a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Observability Docs] add user-specific prompts docs #1705

[Observability Docs] add user-specific prompts docs #1705

Uh oh!

mdbirnstiehl commented Jun 11, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Jun 11, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

davidgeorgehope commented Jun 13, 2025 •

edited

Loading

Uh oh!

mdbirnstiehl commented Jun 16, 2025

Uh oh!

davidgeorgehope commented Jun 16, 2025

Uh oh!

mdbirnstiehl commented Jun 16, 2025

Uh oh!

Uh oh!

[Observability Docs] add user-specific prompts docs #1705

Are you sure you want to change the base?

[Observability Docs] add user-specific prompts docs #1705

Uh oh!

Conversation

mdbirnstiehl commented Jun 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Jun 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

davidgeorgehope commented Jun 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mdbirnstiehl commented Jun 16, 2025

Uh oh!

davidgeorgehope commented Jun 16, 2025

Uh oh!

mdbirnstiehl commented Jun 16, 2025

Uh oh!

Uh oh!

mdbirnstiehl commented Jun 11, 2025 •

edited

Loading

github-actions bot commented Jun 11, 2025 •

edited

Loading

davidgeorgehope commented Jun 13, 2025 •

edited

Loading