Cocoon thoroughly prepares your data for RAG. Specifically, Cocoon helps document, connect, and optimize your data pipelines offline. The result can be used for online RAG in use cases like pipeline copilots and data transformation. Check out the YouTube demo 👇:
Get Started
- 👉 Try this Google Colab notebook for Data Warehouse RAG
- 👉 Try this Google Colab notebook for Data Pipeline RAG
Cocoon is available on PyPI:
To get started, you need to connect to:
- LLMs (e.g., GPT-4, Claude-3, Gemini-Ultra, or your local LLMs)
- Data warehouses (e.g., Snowflake, BigQuery, DuckDB...)
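Connecting to an LLM means supplying an API key, and you may prefer not to paste it into a shared notebook. A minimal sketch of one way to do that, assuming the key is stored in an environment variable; `load_api_key` is a hypothetical helper written for this example and is not part of Cocoon:

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in `env_var`, or raise if it is unset or blank.

    Hypothetical helper for illustration only; not part of the Cocoon API.
    """
    # Treat a missing variable and a whitespace-only value the same way.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before starting Cocoon."
        )
    return key
```

With this in place, `openai.api_key = load_api_key()` could stand in for a hard-coded string, so the key never appears in the notebook itself.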
from cocoon_data import *

# if you use OpenAI GPT-4
openai.api_key = 'xycabc'

# if you use Snowflake
con = snowflake.connector.connect(...)

query_widget, cocoon_workflow = create_cocoon_workflow(con)

# a helper widget to query your data warehouse
query_widget.display()

# the main panel to interact with Cocoon
cocoon_workflow.start()
🎉 You should see the following in your notebook: