Documentation

Getting Started

AskYourDatabase is a tool that lets you query your database using natural language. It currently has two main use cases:

Customer-facing BI Chatbot

If you provide a BI dashboard for your customers, their needs will vary, which makes it hard to create a single dashboard that fits everyone.

Continuously adding features to a BI dashboard will make it more complex and harder to use, and your engineers will become busier.

With AskYourDatabase Chatbot, you can give your customers a chatbot that lets them run ad-hoc queries, freeing your engineers and providing a better user experience.

For how to create a chatbot, please refer to this doc.

Internal Tools

You probably need an admin dashboard to manage or monitor your business, and you may use tools like PowerBI or Retool to build and use it.

But with AskYourDatabase, you typically don't have to build any internal tools. You just have to connect to your database, and you are ready to go for various tasks including:

Business intelligence
Inserting data
Updating data
Reporting
Data analysis

For how to connect to your database, please refer to this doc. You can also view videos created by community members, like this one.

Get Better Answers

Although we have tried our best to enable ChatGPT to understand your database schema and generate SQL queries, there are still cases where ChatGPT may generate incorrect SQL queries or behave poorly.

After reading this document, you will know how to maximize the chances of generating correct SQL queries.

Tip 1: Make Schema Human-Understandable

The better named and easier to understand your schema is, the better ChatGPT can comprehend it and generate correct SQL queries.

If you are starting a new database or creating a new table, follow these tips:

Use semantic naming for tables and columns so that ChatGPT can better understand your schema. For example, user is better than u, sales_amount is better than sa.
Use foreign key constraints to link tables, so that ChatGPT can understand the relationships between them.
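As a minimal sketch of both tips, the snippet below creates two tables with descriptive names and a foreign key, using Python's built-in sqlite3 module purely for illustration. The table and column names (customer, sales_order, and so on) are hypothetical, and the same idea applies to MySQL, PostgreSQL, or any other supported database.

```python
import sqlite3

# Hypothetical schema for illustration: descriptive names plus a foreign key.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    full_name   TEXT NOT NULL
);
CREATE TABLE sales_order (
    sales_order_id INTEGER PRIMARY KEY,
    customer_id    INTEGER NOT NULL REFERENCES customer(customer_id),
    sales_amount   REAL NOT NULL
);
""")

# The declared relationship is visible in the schema metadata.
foreign_keys = conn.execute("PRAGMA foreign_key_list(sales_order)").fetchall()
referenced_table = foreign_keys[0][2]  # column 2 holds the referenced table name
print(referenced_table)
```

Because the relationship is declared in the schema rather than implied by naming alone, anything that reads the schema can see how sales_order joins to customer.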

Tip 2: Add Descriptive Comments

Add descriptive comments to your schema when it is not self-explanatory or relies on implicit conventions. Typical cases include:

Some columns have enum values, each with a specific meaning. For example:

You have a table storing all "orders", and each order has a status, which can be "pending", "paid", "shipped", "delivered", or "cancelled".

If the status column's data type is a plain string, ChatGPT has no way of knowing which values it can hold. Suppose you then ask, "How many orders are not paid yet?"

Although you mean "How many orders have status 'pending'", ChatGPT may generate a query like this:

SELECT COUNT(*) FROM orders WHERE status = 'not paid'

This is incorrect, because no order has the status 'not paid'.
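To make the difference concrete, here is a small sketch that runs both the guessed filter and the intended one against a toy orders table. It uses an in-memory SQLite database and made-up rows, so it only illustrates the failure mode and says nothing about your real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)",
    [("pending",), ("pending",), ("paid",), ("shipped",), ("cancelled",)],
)

# The guessed value matches no rows; the real enum value does.
guessed = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE status = 'not paid'"
).fetchone()[0]
intended = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE status = 'pending'"
).fetchone()[0]
print(guessed, intended)  # 0 2
```

The guessed query runs without any error and simply returns 0, which is why this kind of mistake is easy to miss.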

Some columns have implicit meanings or abbreviations that are hard to understand, like "tz_location", "tblRep", "WS_MAPPING". It's hard to know what they mean without context.

For all these cases where your schema itself does not convey enough information, we recommend adding comments to help ChatGPT understand it better.

Adding a comment is easy. For example, to add a comment to the "status" column, just say:

Add comment to orders->status column: "pending", "paid", "shipped", "delivered", "cancelled"

If "WS_MAPPING" means "Workstation Mapping", just say:

Add comment to WS_MAPPING column: "WS_MAPPING" means "Workstation Mapping"

The principle is:

If there's something implicit or hard to understand in your schema but needed for generating better SQL, just add a comment to explain it in a descriptive way with enough context.

If you do not know your schema well, you could let your DBA or someone who knows your schema well add comments to your schema.

Tip 3: Debug SQL and Give Feedback

If you keep getting wrong answers and you happen to know SQL, you can inspect the generated SQL to see what went wrong.

If you can tell what's wrong with the SQL, give that feedback to ChatGPT, so that it can learn from the mistake and generate better SQL next time.

Tip 4: Ask Good Questions

Don't be vague or ambiguous; be specific and clear.

Instead of asking "How many locations are there?", ask "How many distinct locations are there in the customer table?"

If you know the table name and column name, it's better to mention them, as that helps a lot in generating correct SQL.

Tip 5: Avoid Overly Long Questions

ChatGPT has a limited context window, so submitting lengthy SQL queries or extensive articles can lead to a context limit error.

Prepare Your Database Connection String

We currently support several types of databases, including MySQL, PostgreSQL, SQL Server, Snowflake, and Vertica.

Here are some examples:

MySQL

mysql://user:password@host:port/database

PostgreSQL

postgresql://user:password@host:port/database

MongoDB

mongodb://user:password@host:port/database

SQL Server

Data Source=host;Initial Catalog=database;User ID=user;Password=password;

We use ADO.NET-format connection strings. For more details, please refer to Microsoft's official documentation.

Vertica

vertica://user:password@host:port/database

If you encounter any issues while preparing your connection string, we recommend using this utility tool.
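A common source of trouble with URL-style connection strings is a password containing characters such as "@", ":" or "/". The sketch below shows one way to assemble such a string with the reserved characters percent-encoded, following general URL conventions. The helper name build_connection_url and the sample credentials are hypothetical, and whether your setup needs this encoding is an assumption you should verify.

```python
from urllib.parse import quote, unquote, urlsplit

def build_connection_url(scheme, user, password, host, port, database):
    """Assemble scheme://user:password@host:port/database, escaping reserved characters."""
    # safe="" makes quote() escape every reserved character, including "/" and "@".
    return (
        f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )

# Hypothetical credentials for illustration only.
url = build_connection_url(
    "postgresql", "report_user", "p@ss/w:rd", "db.example.com", 5432, "sales"
)
print(url)

# Parsing the URL back confirms that each component survives the round trip.
parts = urlsplit(url)
print(parts.hostname, parts.port, unquote(parts.password))
```

Without the encoding, the "@" inside the password would be read as the separator between the credentials and the host.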

If you don't have a database and just want to test, you can apply for a free PostgreSQL database at Neon DB.

Whitelist Our Static IP (Only Required for Website Chatbot)

You need to whitelist our server IP 43.159.146.204 to allow the Website Chatbot to connect to your database successfully.

If you're unsure how to whitelist an IP, please contact your database provider or your database administrator, or ask ChatGPT directly.

If you are using the Desktop app, you can skip this step. (The Desktop app connects to your database from your local computer).
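For the whitelisting step above, many firewalls express their allowlist as CIDR ranges, where a single address is written with a /32 suffix. The following sketch checks whether a given allowlist would admit our server IP. The office range and the helper name is_allowed are hypothetical, and your provider's rule syntax may differ.

```python
import ipaddress

ASKYOURDATABASE_IP = ipaddress.ip_address("43.159.146.204")

def is_allowed(allowlist):
    """Return True if any CIDR entry in the allowlist covers the chatbot's IP."""
    return any(ASKYOURDATABASE_IP in ipaddress.ip_network(entry) for entry in allowlist)

# Hypothetical firewall rules, for illustration only.
office_only = ["203.0.113.0/24"]
with_chatbot = office_only + ["43.159.146.204/32"]

print(is_allowed(office_only))   # False
print(is_allowed(with_chatbot))  # True
```

Adding the single /32 entry opens access for that one address only, rather than for a wider range.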

Security Portal

At AskYourDatabase, we offer two main products: the Desktop App and the Chatbot. Both are designed with strong security measures to protect your sensitive data. Here's how each product operates and handles your information:

Desktop App: Local Processing for Maximum Security

Our Desktop App is designed to keep your data as close to you as possible:

Local Database Connection: Your database credentials are stored locally on your computer. The connection to your database is established directly from your local machine.
OpenAI API Interaction: When you ask a question, it's sent to OpenAI's API through our secure gateway. OpenAI generates SQL queries or natural language responses.
Local Query Execution: If a SQL query is generated, it's executed locally on your machine against your database.
Minimal Data Transfer: Only the necessary response data is sent back to OpenAI for further analysis and explanation.
No Data Storage: Our gateway does not store any intermediate data, especially your conversation information.
OpenAI's Privacy Commitment: OpenAI has committed not to use API Platform conversation data for model training. For more details, visit OpenAI's Enterprise Privacy page.

In essence, with the Desktop App, your database credentials never leave your local environment, and only the conversation data is sent to OpenAI's API.

Chatbot: Cloud-Based Solution with Robust Security Measures

Our Chatbot product operates in the cloud, requiring a different set of security measures:

Secure Credential Handling: You provide us with your database credentials, which we encrypt using a private key. We never store these credentials in plain text.
Whitelisted Access: You need to whitelist the fixed IP of our gateway service in your database firewall.
TLS Encryption: All connections between our service and your database occur over TLS, ensuring data in transit is secure.
Access Control Recommendations: We highly recommend whitelisting only our IP and using a read-only user if only SELECT queries are needed.
Query Sanitization: We sanitize AI-generated SQL queries to prevent potential security issues.
Customizable Access: You can disable access to specific tables and implement row-level policies to restrict user-level permissions.
Data Storage: Due to the nature of the Chatbot, we do store conversation records and your encrypted credentials.
Enterprise Solutions: We offer enterprise-grade private deployment solutions for organizations with stricter data locality requirements.
Data Deletion Rights: You have the right to request deletion of all your data stored on our platform.
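To illustrate why a read-only user is worth setting up, the sketch below shows reads succeeding while writes are rejected by the database itself. On MySQL or PostgreSQL you would normally achieve this by creating a dedicated user with only SELECT privileges. SQLite has no users, so this example uses its query_only pragma as a stand-in, and it does not describe how AskYourDatabase enforces anything internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO orders (status) VALUES ('pending')")
conn.commit()

# Stand-in for a read-only database user: SQLite rejects writes on this connection.
conn.execute("PRAGMA query_only = ON")

# Reads still work.
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# Writes fail at the database level, regardless of what SQL was generated.
try:
    conn.execute("DELETE FROM orders")
    write_blocked = False
except sqlite3.Error:
    write_blocked = True

print(row_count, write_blocked)
```

The point of the design is defense in depth: even if an unexpected statement reaches the database, the restricted connection cannot modify data.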

Data Collection Summary

Desktop App

Collects: User queries
Does not collect: Database credentials, query results

Chatbot

Collects: User queries, conversation history, encrypted database credentials
Does not collect: Plain text database credentials, full query results

Both products are designed with your data security in mind, offering different levels of control and protection based on your specific needs and requirements.

For more information on how to get started with AskYourDatabase, please refer to our Getting Started guide.

Commitment to Security and Compliance

At AskYourDatabase, we are committed to maintaining the highest standards of security and compliance to protect your data. We are pleased to announce that we have initiated the SOC 2 Type 2 audit process, demonstrating our dedication to robust security practices and data protection.

SOC 2 Type 2 Audit

We have begun the rigorous process of obtaining SOC 2 Type 2 compliance certification. This comprehensive audit examines our security controls, processes, and procedures over an extended period, typically 6-12 months. The SOC 2 Type 2 report will provide detailed information about how we manage customer data, focusing on the security, availability, processing integrity, confidentiality, and privacy of our systems.

We anticipate receiving our first complete SOC 2 Type 2 compliance report in January 2025. This timeline allows for a thorough examination of our systems and practices, ensuring that we meet the stringent requirements set forth by the American Institute of Certified Public Accountants (AICPA).

By pursuing SOC 2 Type 2 compliance, we aim to:

Validate our existing security measures
Identify and address any potential vulnerabilities
Provide our customers with additional assurance regarding our data protection practices
Demonstrate our ongoing commitment to maintaining a secure and compliant environment

We look forward to sharing more details about our SOC 2 Type 2 compliance journey as we progress through the audit process. This certification will be a significant milestone in our continuous efforts to ensure the security and privacy of your data.
