# LCORE-1422: Inline RAG (BYOK) e2e tests (#1370)
**New file** (38 lines) — server-mode e2e configuration, reaching a remote Llama Stack at `${env.E2E_LLAMA_HOSTNAME}`:

```yaml
name: Lightspeed Core Service (LCS)
service:
  host: 0.0.0.0
  port: 8080
  auth_enabled: false
  workers: 1
  color_log: true
  access_log: true
llama_stack:
  use_as_library_client: false
  url: http://${env.E2E_LLAMA_HOSTNAME}:8321
  api_key: xyzzy
user_data_collection:
  feedback_enabled: true
  feedback_storage: "/tmp/data/feedback"
  transcripts_enabled: true
  transcripts_storage: "/tmp/data/transcripts"

conversation_cache:
  type: "sqlite"
  sqlite:
    db_path: "/tmp/data/conversation-cache.db"

authentication:
  module: "noop"

byok_rag:
  - rag_id: e2e-test-docs
    rag_type: inline::faiss
    embedding_model: sentence-transformers/all-mpnet-base-v2
    embedding_dimension: 768
    vector_db_id: ${env.FAISS_VECTOR_STORE_ID}
    db_path: ${env.KV_RAG_PATH:=~/.llama/storage/rag/kv_store.db}
    score_multiplier: 1.0

rag:
  inline:
    - e2e-test-docs
```
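The `${env.NAME}` and `${env.NAME:=default}` placeholders in the configuration above are resolved from the environment when the config is loaded. A minimal sketch of that kind of substitution, assuming a POSIX-style `:=` fallback — the function name and regex here are illustrative, not the project's actual resolver:

```python
import os
import re

# Matches ${env.NAME} or ${env.NAME:=default}; group 1 is the variable
# name, group 2 the optional fallback value (None when absent).
_ENV_REF = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)(?::=([^}]*))?\}")


def resolve_env(value: str) -> str:
    """Replace ${env.NAME} references in a config string with env values.

    A `:=default` suffix supplies a fallback used when the variable is
    unset; a reference with no fallback and no value raises KeyError.
    """
    def _sub(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        resolved = os.environ.get(name, default)
        if resolved is None:
            raise KeyError(f"environment variable {name} is not set")
        return resolved

    return _ENV_REF.sub(_sub, value)


os.environ.pop("KV_RAG_PATH", None)
print(resolve_env("${env.KV_RAG_PATH:=~/.llama/storage/rag/kv_store.db}"))
# -> ~/.llama/storage/rag/kv_store.db  (fallback, since KV_RAG_PATH is unset)
```

With `KV_RAG_PATH` exported, the same call would return the exported value instead of the fallback.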
**New file** (40 lines) — the same setup running Llama Stack as an in-process library client (`run.yaml`), with explicit inference defaults:

```yaml
name: Lightspeed Core Service (LCS)
service:
  host: 0.0.0.0
  port: 8080
  auth_enabled: false
  workers: 1
  color_log: true
  access_log: true
llama_stack:
  use_as_library_client: true
  library_client_config_path: run.yaml
user_data_collection:
  feedback_enabled: true
  feedback_storage: "/tmp/data/feedback"
  transcripts_enabled: true
  transcripts_storage: "/tmp/data/transcripts"

conversation_cache:
  type: "sqlite"
  sqlite:
    db_path: "/tmp/data/conversation-cache.db"

authentication:
  module: "noop"
inference:
  default_provider: openai
  default_model: gpt-4o-mini

byok_rag:
  - rag_id: e2e-test-docs
    rag_type: inline::faiss
    embedding_model: sentence-transformers/all-mpnet-base-v2
    embedding_dimension: 768
    vector_db_id: ${env.FAISS_VECTOR_STORE_ID}
    db_path: ${env.KV_RAG_PATH:=~/.llama/storage/rag/kv_store.db}
    score_multiplier: 1.0

rag:
  inline:
    - e2e-test-docs
```
**New file** (41 lines) — server-mode configuration pointing at the `llama-stack` container hostname:

> **Reviewer:** Create a copy of these new configs also in e2e-prow.
>
> **Author:** Addressed in a new commit.

```yaml
name: Lightspeed Core Service (LCS)
service:
  host: 0.0.0.0
  port: 8080
  auth_enabled: false
  workers: 1
  color_log: true
  access_log: true
llama_stack:
  use_as_library_client: false
  url: http://llama-stack:8321
  api_key: xyzzy
user_data_collection:
  feedback_enabled: true
  feedback_storage: "/tmp/data/feedback"
  transcripts_enabled: true
  transcripts_storage: "/tmp/data/transcripts"

conversation_cache:
  type: "sqlite"
  sqlite:
    db_path: "/tmp/data/conversation-cache.db"

authentication:
  module: "noop"
inference:
  default_provider: openai
  default_model: gpt-4o-mini

byok_rag:
  - rag_id: e2e-test-docs
    rag_type: inline::faiss
    embedding_model: sentence-transformers/all-mpnet-base-v2
    embedding_dimension: 768
    vector_db_id: ${env.FAISS_VECTOR_STORE_ID}
    db_path: ${env.KV_RAG_PATH:=~/.llama/storage/rag/kv_store.db}
    score_multiplier: 1.0

rag:
  inline:
    - e2e-test-docs
```
**Modified** — the FAISS feature test now expects the fixed BYOK rag id instead of the `{VECTOR_STORE_ID}` placeholder:

```diff
@@ -14,7 +14,7 @@ Feature: FAISS support tests
       """
       {
         "rags": [
-          "{VECTOR_STORE_ID}"
+          "e2e-test-docs"
         ]
       }
       """
```
**New file** (72 lines) — the inline RAG (BYOK) feature tests:

```gherkin
Feature: Inline RAG (BYOK) support tests

  Background:
    Given The service is started locally
    And The system is in default state
    And I set the Authorization header to Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva
    And REST API service prefix is /v1
    And The service uses the lightspeed-stack-inline-rag.yaml configuration
    And The service is restarted

  Scenario: Check if inline RAG source is registered
    When I access REST API endpoint rags using HTTP GET method
    Then The status code of the response is 200
    And the body of the response has the following structure
      """
      {
        "rags": [
          "e2e-test-docs"
        ]
      }
      """

  Scenario: Query with inline RAG returns relevant content
    When I use "query" to ask question with authorization header
      """
      {"query": "What is the title of the article from Paul?", "system_prompt": "You are an assistant. Write only lowercase letters"}
      """
    Then The status code of the response is 200
    And The response should contain following fragments
      | Fragments in LLM response |
      | great work                |
    And The response should contain non-empty rag_chunks

  Scenario: Inline RAG query includes referenced documents
    When I use "query" to ask question with authorization header
      """
      {"query": "What does Paul Graham say about great work?"}
      """
    Then The status code of the response is 200
    And The response should contain non-empty referenced_documents
```

> **Reviewer:** Why did you skip the verification of the response content in this case?
>
> **Author:** I thought separate tests should test different things, but I can update the test to get more consistency in the results.

```gherkin
  Scenario: Streaming query with inline RAG returns relevant content
    When I use "streaming_query" to ask question with authorization header
      """
      {"query": "What is the title of the article from Paul?", "system_prompt": "You are an assistant. Write only lowercase letters"}
      """
    Then The status code of the response is 200
    And I wait for the response to be completed
    And The streamed response should contain following fragments
      | Fragments in LLM response |
      | great work                |

  Scenario: Responses API with inline RAG returns relevant content
    When I use "responses" to ask question with authorization header
      """
      {"input": "What is the title of the article from Paul?", "model": "{PROVIDER}/{MODEL}", "stream": false, "instructions": "You are an assistant. Write only lowercase letters"}
      """
    Then The status code of the response is 200
    And The response should contain following fragments
      | Fragments in LLM response |
      | great work                |
```

> **Reviewer:** Add here also a test where `stream` is set to `true`, to cover all existing cases.
>
> **Author:** Addressed in a new commit, thank you.

```gherkin
  Scenario: Streaming Responses API with inline RAG returns relevant content
    When I use "responses" to ask question with authorization header
      """
      {"input": "What is the title of the article from Paul?", "model": "{PROVIDER}/{MODEL}", "stream": true, "instructions": "You are an assistant. Write only lowercase letters"}
      """
    Then The status code of the response is 200
    And I wait for the response to be completed
    And The streamed response should contain following fragments
      | Fragments in LLM response |
      | great work                |
```
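The registration scenario above boils down to one structural check: `GET /v1/rags` must return HTTP 200 with a body listing the `rag_id` declared in the `byok_rag` config section. A sketch of that check as a plain function — the helper name is made up and is not part of the test framework; only the expected status and JSON shape come from the scenario:

```python
import json

# Expected rag id, taken from the byok_rag section of the configs in this PR.
EXPECTED_RAG_ID = "e2e-test-docs"


def rags_response_ok(status_code: int, body_text: str) -> bool:
    """True when a /v1/rags response matches the structure the scenario expects."""
    if status_code != 200:
        return False
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        return False
    # The body must be an object whose "rags" list contains exactly the BYOK id.
    return isinstance(body, dict) and body.get("rags") == [EXPECTED_RAG_ID]


print(rags_response_ok(200, '{"rags": ["e2e-test-docs"]}'))  # -> True
```

The same predicate returns False for a non-200 status, a malformed body, or a `rags` list that does not match exactly.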