Toronto Chatbots

At Toronto Chatbots we are the go-to service for developing custom, highly functional chatbots in Toronto and beyond. We specialize in building Custom GPTs, integrating Amazon Q, and fine-tuning LLMs and SLMs, as well as retrieval-augmented generation (RAG) and distillation. Chatbots can be applied to a multitude of uses, including content creation and data analysis, and to industries such as digital marketing, finance, and legal. These chatbots are trained on your data, so in effect you are talking to your data, which can save a great deal of time with impressive results. Common types of data that can be imported into your chatbot include PDFs, news articles, books, computer code, emails, chat logs, spreadsheets, URLs, images, and many more. Reach out to us now for a free consultation, and don't forget to ask about our other data services.

Amazon Q

Amazon Q lets business owners tap into their company's content and data, providing a fast and easy way to get relevant answers to important questions, solve problems, and generate content. It is a great option for organizations already in the Amazon ecosystem: Amazon Q connects easily and securely to commonly used systems and tools, and hosting is reasonably priced.

Custom GPTs

Since its launch in late 2022, ChatGPT has taken the world by storm, reaching roughly 800 million weekly active users. To create a Custom GPT you need a ChatGPT Plus plan, which is also reasonably priced. A common misconception is that GPT knows everything. It does not, which is why it is important to train it on your own data. Custom GPTs can be private or public.

How it Works

Custom GPTs and Amazon Q are custom-trained large language models (LLMs). LLMs are deep learning models that use neural networks, which loosely mimic how the human brain works, to process large amounts of data. One of the key innovations behind LLMs is the self-attention mechanism, which determines which parts of the text are most important and where to focus the model's attention. LLMs use a statistical process to predict the next word in a sequence. Another idea that enables LLMs is word embeddings: each word is assigned an embedding, a list of typically hundreds to a few thousand numbers, and similar words have similar embeddings.
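To make the embedding idea concrete, here is a minimal sketch using toy 4-dimensional vectors (real models use hundreds to thousands of dimensions, and the values below are invented for illustration). Cosine similarity is the standard way to measure how close two embeddings are:

```python
import math

# Toy word embeddings. Real embeddings are learned by the model and
# have far more dimensions; these values are illustrative only.
embeddings = {
    "king":  [0.9, 0.8, 0.1, 0.2],
    "queen": [0.9, 0.7, 0.2, 0.2],
    "apple": [0.1, 0.2, 0.9, 0.8],
}

def cosine_similarity(a, b):
    """Similar words have similar embeddings, so their cosine is near 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# "king" and "queen" point in nearly the same direction;
# "king" and "apple" do not.
print(cosine_similarity(embeddings["king"], embeddings["queen"]))
print(cosine_similarity(embeddings["king"], embeddings["apple"]))
```

This is the property RAG and semantic search rely on: text with similar meaning lands close together in the embedding space.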

Fine Tuning

Fine-tuning takes a pre-trained language model and adjusts its weights to produce a model trained on your custom data. This typically involves training on up to 10,000 question/answer pairs, and we work closely with our clients to create the most effective pairs from their data. We usually fine-tune a small language model: for most cases your model doesn't need to know everything about everything, and a small language model is more efficient and cost-effective to run.
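As a sketch of what the training data looks like, the snippet below writes question/answer pairs in the chat-message JSONL layout commonly used by fine-tuning APIs (the exact schema varies by provider, and the Q/A pairs here are hypothetical examples):

```python
import json

# Hypothetical question/answer pairs drawn from a client's data.
qa_pairs = [
    ("What are your store hours?", "We are open 9am to 5pm, Monday to Friday."),
    ("Do you ship to Toronto?", "Yes, we offer free shipping within Toronto."),
]

# One JSON object per line: the widely used chat-format layout for
# fine-tuning data. Check your provider's docs for the exact schema.
with open("training_data.jsonl", "w") as f:
    for question, answer in qa_pairs:
        record = {
            "messages": [
                {"role": "user", "content": question},
                {"role": "assistant", "content": answer},
            ]
        }
        f.write(json.dumps(record) + "\n")
```

A real fine-tuning job would use thousands of such pairs, which is why careful curation of the data matters more than the mechanics of the format.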

RAG

RAG (retrieval-augmented generation) involves connecting a data store to your model: the stored content is tokenized, converted into embeddings, and saved in a vector database. During inference, the user query is also converted into embeddings and matched against similar embeddings in the vector database, and the most relevant content is retrieved to ground the model's answer. You can also combine fine-tuning with RAG. For instance, one model could use the RAG data while a separate model is fine-tuned on examples of your desired personality, with the two combined during inference.
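The retrieval step can be sketched in a few lines. This toy version uses a bag-of-words count as a stand-in for a neural embedding and a plain list as a stand-in for a vector database; the documents are invented examples:

```python
import math
from collections import Counter

# Hypothetical document chunks that would live in the data store.
documents = [
    "Our refund policy allows returns within 30 days.",
    "Support is available by email 24 hours a day.",
    "We are headquartered in Toronto, Ontario.",
]

def embed(text):
    """Toy embedding: a word-count vector. Real RAG uses a neural model."""
    words = text.lower().replace(".", " ").replace("?", " ").split()
    return Counter(words)

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    denom = math.sqrt(sum(v * v for v in a.values())) * \
            math.sqrt(sum(v * v for v in b.values()))
    return dot / denom if denom else 0.0

# "Vector database": precomputed embeddings for every chunk.
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query):
    """Embed the query and return the most similar stored chunk."""
    q = embed(query)
    return max(index, key=lambda pair: cosine(q, pair[1]))[0]

# The refund-policy chunk is the closest match for this query.
print(retrieve("What is your refund policy?"))
```

In a production system the retrieved chunk would then be passed to the language model alongside the user's question, so the answer is grounded in your data rather than the model's general knowledge.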

Distillation

Distillation is another powerful technique that is becoming increasingly popular. It involves using an LLM to generate question/answer pairs and then fine-tuning an SLM on that output. Running an SLM instead of an LLM is cheaper and more efficient.
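The pipeline can be sketched as follows. Here `teacher_answer` is a hypothetical stand-in for a call to a large "teacher" model; in practice it would be a real LLM API call, and the resulting records would be fed to an SLM fine-tuning job:

```python
import json

def teacher_answer(question):
    """Placeholder for the large teacher LLM; returns canned answers here."""
    canned = {
        "What is RAG?": "RAG retrieves relevant documents to ground a model's answers.",
        "What is fine-tuning?": "Fine-tuning adjusts a pre-trained model's weights on custom data.",
    }
    return canned[question]

# Questions covering the domain the small student model should learn.
questions = ["What is RAG?", "What is fine-tuning?"]

# Build the student's training set from the teacher's answers.
training_set = [
    {"question": q, "answer": teacher_answer(q)} for q in questions
]

# Each record would become one line of the student's fine-tuning file.
for record in training_set:
    print(json.dumps(record))
```

The result is a small model that captures the teacher's knowledge for a narrow domain at a fraction of the serving cost.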