ChatGPT and Crunchbase API: How to Create a Plugin from Scratch

Find out how to create plugins for ChatGPT in this complete step-by-step tutorial. Learn how to extend the capabilities of ChatGPT by using the Crunchbase Basic APIs as a data service. This article will guide you through the process of creating a plugin from scratch, providing you with the tools and knowledge you need to customize and enhance ChatGPT responses.

Let’s start at the beginning, but if you want to go straight to the result, the repo is available on GitHub. If you want to explore other options with GPT, take a look at this other page.

What are ChatGPT Plugins and what can they do?

As a developer, I am always looking for ways to extend and improve the capabilities of the tools I use. Recently, I have been working with OpenAI’s ChatGPT and have discovered a fascinating feature: plugins.

A ChatGPT plugin is essentially an extension that allows ChatGPT to interact with APIs, databases and other services. This means that I can customize and enhance ChatGPT’s capabilities, allowing it to provide richer and more accurate information.

For example, you could create a plugin that allows ChatGPT to interact with a weather forecast API. This would allow ChatGPT to provide real-time responses to weather queries. Or you could create a plugin that allows ChatGPT to interact with a recipe database, which would allow ChatGPT to suggest recipes based on the ingredients at hand.

Tutorial Objective

The main objective of this tutorial is to guide you through the process of creating a plugin for ChatGPT that can search for company information using a third-party service, in this case, the Crunchbase database of companies and funding rounds.

To achieve this goal, we will use several tools:

  • First, we will use GPT-4 in ChatGPT for code generation. GPT-4 is an advanced version of OpenAI’s text generation technology, which can be used to generate code in a variety of programming languages. More information about GPT-4 can be found on the OpenAI website.
  • In addition, we will use CodeSandbox to deploy our plugin. CodeSandbox is an online platform that allows developers to create, share and deploy web applications. You can learn more about CodeSandbox on its official website.
  • Finally, as I mentioned earlier, we will use the Crunchbase Basic API to search for information about companies. Crunchbase is a platform that provides information on private and public companies, including news, investments and financing. You can learn more about the Crunchbase Basic API in its official documentation.

Overall, we will follow the guidance in the OpenAI post about plugins and, in particular, its last video.

Initial configuration

Before we start, we need to configure the tools I was mentioning:

  • ChatGPT and GPT4: to test the plugins, you will need to register as a developer. Follow these instructions to join the waiting list. You may proceed through the tutorial while OpenAI reviews your application.
  • Crunchbase Basic API: you will need a CRUNCHBASE_API_KEY, so you first have to sign up as a user (it’s free). Once registered, follow these instructions:
    • Go to your personal area (Account Settings), you will see the “Integrations” section and there, Crunchbase API.
    • You can now generate the API_KEY to access the data:
      • These are the endpoints that are available in the BASIC version. We will use “Organization Search”.
      • The data you can retrieve using the BASIC version can be found here. It is included in the “Organization” schema.
Crunchbase API_KEY generation screen
  • CodeSandbox: Here you can register a free account. If you sign up using your GitHub user, you will be able to link both environments to develop in a more agile way. Once registered, you can fork my virtual machine to have everything ready and be able to modify the code according to your needs.

First step: create your API

In general, plugins have three main elements: your API, its specification so that OpenAI can understand it, and the manifest that describes it so that ChatGPT can use it. Remember that a plugin is nothing more than a tool that ChatGPT can use, so you will need to describe its functionality correctly.

How would you describe the basic function of the plugin (and therefore of your API)? In this case it would be something like this:

Company search and lookup application that lets the user search for companies and retrieve information about them using the Crunchbase API. The search for companies will be based on their name.

With this description, we are going to ask ChatGPT to generate the basic skeleton of our API and then improve it. We will use this template to which we have incorporated the general description of our API:

Write a company search and lookup application using FastAPI that lets the user search for companies and retrieve information about them using the Crunchbase API. The search for companies will be based on their name.

Include a '__main__' section which will run this app using uvicorn. The python module where I save this code will be called 'main.py'.

In addition to the normal endpoints, include a route '/.well-known/ai-plugin.json' which serves (as JSON) the contents of './manifest.json', located in the same directory as 'main.py'. Exclude this endpoint from the OpenAPI spec and don't serve any other static content.

The specification of Crunchbase's API is in this swagger https://app.swaggerhub.com/apis-docs/Crunchbase/crunchbase-enterprise_api/1.0.3

You will use the endpoint Search Organizations (POST) to search for companies based on their name and you will fetch the following fields: 'name', 'short-description' and 'website'.

The endpoints you will use from this API will be:
- Search Organizations (POST): https://app.swaggerhub.com/apis-docs/Crunchbase/crunchbase-enterprise_api/1.0.3#/Search/post_searches_organizations

The information with the Organizations' schema is here: https://app.swaggerhub.com/apis-docs/Crunchbase/crunchbase-enterprise_api/1.0.3#/Organization

Let’s see what we have asked for:

  • First, we have described the purpose of the task and the basic requirements for using FastAPI and Crunchbase.
  • We have indicated that the code will be in a file “main.py” and that it will be served using “uvicorn”.
  • Following the OpenAI specifications, we ask for the manifest.json file to be served at the URL indicated by OpenAI, i.e. /.well-known/ai-plugin.json, for that route to be excluded from the OpenAPI specification, and for no other static content to be served.
  • Here comes the interesting part: using the WebPilot plugin, we tell ChatGPT where to find the information it needs to use the Crunchbase API, namely the Search Organizations POST endpoint and the schema, so that it knows the fields (field_ids) and the operators it can use in the query.

Well, we’re almost there. Here is a link to the whole conversation I had with ChatGPT to generate the code. Keep in mind that there are always small details to adjust, but you can ask ChatGPT for those as well. If you have any questions about the changes, leave me a comment.

After these adjustments, here is the code that we will use in CodeSandbox:

from fastapi import FastAPI, HTTPException
from fastapi.responses import JSONResponse
from fastapi.openapi.utils import get_openapi
from pydantic import BaseModel
import requests
import json
import os

app = FastAPI()

class Company(BaseModel):
    name: str

@app.post("/search")
async def search_company(company: Company):
    user_key = os.getenv('CRUNCHBASE_API_KEY')
    url = f"https://api.crunchbase.com/api/v4/searches/organizations?user_key={user_key}"
    data = {
        "field_ids": ["name", "short_description", "website_url", "image_url"],
        "query": [
            {
                "type": "predicate",
                "field_id": "identifier",
                "operator_id": "contains",
                "values": [company.name]
            }
        ],
        "limit": 5
    }
    # json= serializes the body and sets the Content-Type header
    response = requests.post(url, json=data, timeout=10)
    if response.status_code != 200:
        # Console message to help debug failed Crunchbase requests
        print("Crunchbase error:", response.status_code, response.text)
        raise HTTPException(status_code=400, detail="Unable to fetch data from Crunchbase API")

    # Extract the required fields
    extracted_data = []
    for entity in response.json().get('entities', []):
        properties = entity['properties']
        extracted_data.append({
            'company_name': properties.get('name'),
            'description': properties.get('short_description'),
            'website_url': properties.get('website_url'),
            'image_url': properties.get('image_url'),
        })

    # Return the list itself: FastAPI serializes it to JSON, so calling
    # json.dumps here would double-encode the response as a string
    return extracted_data

@app.get("/.well-known/ai-plugin.json", include_in_schema=False)
async def read_manifest():
    try:
        with open('./manifest.json', 'r') as file:
            data = json.load(file)
        return JSONResponse(content=data)
    except FileNotFoundError:
        raise HTTPException(status_code=404, detail="manifest.json not found")

@app.get("/openai.json")
async def get_openapi_json():
    return JSONResponse(get_openapi(
        title="API for Crunchbase ChatGPT Plugin",
        version="0.9.0",
        description="Sample API exposing an enterprise search endpoint using Crunchbase's Basic API as a third-party service",
        routes=app.routes,
    ))


if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=8000)
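
The request body and the field extraction in the /search endpoint above can also be isolated into two small pure functions, which makes them easy to check without calling Crunchbase. This is only a sketch: the function names build_search_body and extract_companies are my own and do not appear in the generated code, and the sample response is hypothetical, shaped like the one the endpoint parses.

```python
from typing import Any


def build_search_body(name: str, limit: int = 5) -> dict[str, Any]:
    """Build the JSON body sent to the Organization Search endpoint."""
    return {
        "field_ids": ["name", "short_description", "website_url", "image_url"],
        "query": [
            {
                "type": "predicate",
                "field_id": "identifier",
                "operator_id": "contains",
                "values": [name],
            }
        ],
        "limit": limit,
    }


def extract_companies(response_data: dict[str, Any]) -> list[dict[str, Any]]:
    """Keep only the fields that the plugin returns to ChatGPT."""
    companies = []
    for entity in response_data.get("entities", []):
        properties = entity.get("properties", {})
        companies.append(
            {
                "company_name": properties.get("name"),
                "description": properties.get("short_description"),
                "website_url": properties.get("website_url"),
                "image_url": properties.get("image_url"),
            }
        )
    return companies


if __name__ == "__main__":
    # Hypothetical response used only for illustration.
    sample = {"entities": [{"properties": {"name": "Acme"}}]}
    print(build_search_body("acme"))
    print(extract_companies(sample))
```

Fields that Crunchbase does not return simply come back as None, which mirrors the behaviour of the endpoint.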

Deploying the API in CodeSandbox

Once registered on the platform, you just need to create a new Python-based machine. Once it has finished deploying, you will have access to the files. First of all, you have to include in requirements.txt the libraries we need: fastapi, uvicorn and requests. Remember to restart the virtual machine to install them correctly.

The next step is to update the main.py file with the above code.

The machine should already be serving requests, so you can test it with Postman, for example. The URL to use is shown in the start:8000 tab of the CodeSandbox development tools.
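
If you prefer to test from Python instead of Postman, a minimal sketch with the standard library could look like this. The helper name build_search_request is my own, and https://example.invalid is a placeholder that you would replace with the URL of your own machine. The request is only prepared here; the lines that actually send it are left commented out.

```python
import json
import urllib.request


def build_search_request(base_url: str, company_name: str) -> urllib.request.Request:
    """Prepare a POST to the plugin's /search endpoint without sending it."""
    payload = json.dumps({"name": company_name}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL: use the one shown in your CodeSandbox tab.
    request = build_search_request("https://example.invalid", "OpenAI")
    print(request.get_method(), request.full_url)
    # To actually send the request:
    # with urllib.request.urlopen(request, timeout=10) as response:
    #     print(json.loads(response.read()))
```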

If you detect any errors, especially with the Crunchbase requests, it helps to print messages to the console so that you can debug them.
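
One possible way to write those console messages is with Python's logging module. Since the API key travels in the URL as the user_key parameter, this sketch hides it before logging. The function names, the logger name and the error body in the example are all assumptions of mine for illustration.

```python
import logging
import re

logger = logging.getLogger("crunchbase_plugin")


def redact_user_key(url: str) -> str:
    """Hide the value of the user_key query parameter before logging a URL."""
    return re.sub(r"(user_key=)[^&]*", r"\1***", url)


def log_crunchbase_call(url: str, status_code: int, body_preview: str) -> str:
    """Log one Crunchbase call and return the message that was logged."""
    message = f"POST {redact_user_key(url)} -> {status_code}: {body_preview[:200]}"
    level = logging.INFO if status_code == 200 else logging.WARNING
    logger.log(level, message)
    return message


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    log_crunchbase_call(
        "https://api.crunchbase.com/api/v4/searches/organizations?user_key=secret",
        401,
        '{"error": "hypothetical error body"}',
    )
```

Failed calls are logged as warnings, so they stand out in the CodeSandbox console, and the key itself never appears in the output.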

The manifest.json file

Now we only have to take the template of the manifest.json file indicated by OpenAI and adapt it. Keep in mind the best practices given by OpenAI for these descriptions. In our case, the result would be:

{
    "schema_version": "v1",
    "name_for_human": "Crunchbase Plugin",
    "name_for_model": "crunchbase_plugin",
    "description_for_human": "Company information search provided by Crunchbase Basic API (name, description, website and image)",
    "description_for_model": "Plugin for company information search. You can search for companies by name and the information retrieved is their name, description, website and image.",
    "auth": {
        "type": "none"
    },
    "api": {
        "type": "openapi",
        "url": "/openai.json"
    },
    "logo_url": "https://diegoromero.es/wp-content/uploads/2021/10/cropped-network-icon-3.png",
    "contact_email": "example@example.com",
    "legal_info_url": "https://example.com/legal"
}

As you can see, the key point is to describe the plugin accurately so that the model is able to use it (description_for_model). As for the logo, I have included the logo of my website so that it appears correctly in the ChatGPT interface and can be identified.
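
Before installing the plugin, it can be worth checking that the manifest is complete. The sketch below only verifies that the keys used in the manifest above are present and that api.url is not empty. Treating all of those keys as mandatory is an assumption of this example, not a statement of OpenAI's validation rules, and the function name is my own.

```python
# Keys taken from the manifest shown above; treating all of them as
# mandatory is an assumption of this sketch.
REQUIRED_KEYS = (
    "schema_version",
    "name_for_human",
    "name_for_model",
    "description_for_human",
    "description_for_model",
    "auth",
    "api",
    "logo_url",
    "contact_email",
    "legal_info_url",
)


def find_manifest_problems(manifest: dict) -> list[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in manifest]
    api = manifest.get("api")
    if not isinstance(api, dict) or not api.get("url"):
        problems.append("api.url is empty")
    return problems


if __name__ == "__main__":
    # Deliberately incomplete manifest, to show what the report looks like.
    incomplete = {"schema_version": "v1", "api": {"type": "openapi", "url": "/openai.json"}}
    for problem in find_manifest_problems(incomplete):
        print(problem)
```

To check the real file, load ./manifest.json with json.load and pass the resulting dictionary to the function.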

Done. Let’s check that everything works with ChatGPT.

Installing the plugin with ChatGPT

If everything went well, all that remains is to provide the URL of the virtual machine where our API is running in the Plugin Store of the ChatGPT interface. Here is the sequence.

Here is a link to a test conversation with the plugin installed, where you can see that it also works in multiple languages without any extra configuration.

Next steps

As you have seen, the process of creating a plugin is simple. The complexity lies in the management of the information served by the data API and how it is presented to the ChatGPT interface so that it can be correctly interpreted and integrated into the conversation.

From here, if you have Enterprise access to the Crunchbase API, you can extend the search and retrieval capabilities beyond companies to funding rounds, filtering by dates, investors and so on.

Final considerations

This exercise in exploration with plugins, automatic code generation and third-party APIs has been more than just a technical experiment; it has been a demonstration of how we can extend and enhance the capabilities of artificial intelligence tools such as ChatGPT. With the help of GPT4, I have been able to connect data effectively and easily, taking ChatGPT’s capabilities to a new level.

But what makes this exercise even more exciting is its relevance and applicability in the real world. These enhanced capabilities are not limited to ChatGPT; they also extend to Bing Chat and the rest of the Microsoft ecosystem, since plugins will now be compatible across all of these platforms. This means that the possibilities of use are enormous, covering a wide range of applications and contexts.

Microsoft Build Keynote

In addition, the integration of libraries such as LangChain or LlamaIndex adds another layer of functionality and flexibility. These libraries can manage plugins, making them valuable tools for agent development. This opens up a new world of possibilities, allowing the creation of more sophisticated and customized solutions.

From a personal perspective, this exercise has been a true eye-opener. It has shown me how a tool like ChatGPT/GPT4 can boost my productivity and capacity for innovation in ways I had never imagined before. It has allowed me to explore topics that, until now, seemed out of my reach, such as APIs and deployment environments. And it has done so in a direct and accessible way, eliminating the barriers that used to exist. With the help of ChatGPT/GPT4, I have been able to dive into areas that I previously found intimidating and have discovered that I can not only understand them, but also use them to innovate and create.

But what is even more impressive is that all of this has been made possible by the capabilities of Large Language Models (LLMs). These models are not simply tools; they are personal assistants and coaches who can guide us through complex issues and help us understand and use advanced technologies.

In summary, this exercise has been more than a technical demonstration; it has been a learning and personal growth experience. It has shown me how artificial intelligence tools can help us expand our skills and capabilities, and I am excited to further explore the possibilities they offer.

powered by Crunchbase

The Crunchbase Basic API has been central to this exercise. I have been using this database in my work for a long time. I appreciate that Crunchbase makes basic data on all the companies it tracks available to everyone for Market & Research Intelligence.