
Getting Started with Google Gemini Pro API

In this tutorial, we are going to explore the Gemini Pro API, one of the newest and most advanced language models released by Google. The best part? It is free to use, so you do not have to worry about costs while getting started.

Gemini Pro API

Getting Started with Gemini Pro API

Once you open Google AI Studio, you can experiment with models in an interface that looks much like Colab. Gemini Pro is, however, currently only available in Google’s AI Studio. On the right-hand side, you can choose between two Gemini models: Gemini Pro and Gemini Pro Vision. Pick Gemini Pro Vision if you want to upload a picture, or Gemini Pro if you are working with text prompts only. You can also control the temperature and add stop sequences and safety settings.

Safety Settings

Since I use it every day, the feature I appreciate most is Google’s safety options. As a developer, you can control how categories such as harassment are handled, with four levels that range from blocking most content, through blocking some or only a little, down to blocking nothing at all. Under the advanced options you can also set the output length, top K, and top P.
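
The controls described above have counterparts in the Python SDK. The following is a minimal sketch, assuming the google-generativeai package accepts plain dictionaries for generation_config and safety_settings. The helper names and the default values are illustrative assumptions, not recommendations.

```python
# Threshold names as used by the Gemini API, from least to most strict.
ALLOWED_THRESHOLDS = (
    "BLOCK_NONE",
    "BLOCK_ONLY_HIGH",
    "BLOCK_MEDIUM_AND_ABOVE",
    "BLOCK_LOW_AND_ABOVE",
)


def build_generation_config(temperature=0.7, top_p=0.95, top_k=40,
                            max_output_tokens=256, stop_sequences=None):
    """Collect the sampling options in one dictionary (values are examples)."""
    if temperature < 0:
        raise ValueError("temperature must not be negative")
    return {
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "max_output_tokens": max_output_tokens,
        "stop_sequences": list(stop_sequences or []),
    }


def build_safety_settings(threshold="BLOCK_MEDIUM_AND_ABOVE",
                          categories=("HARM_CATEGORY_HARASSMENT",)):
    """Apply one blocking threshold to each listed harm category."""
    if threshold not in ALLOWED_THRESHOLDS:
        raise ValueError(f"unknown threshold: {threshold}")
    return [{"category": c, "threshold": threshold} for c in categories]


def ask(prompt, api_key):
    """Hypothetical helper: send one prompt using the settings above."""
    import google.generativeai as genai  # imported lazily; needs the package

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(
        "gemini-1.5-flash",
        generation_config=build_generation_config(),
        safety_settings=build_safety_settings(),
    )
    return model.generate_content(prompt).text
```

Keeping the settings in small builder functions makes it easy to try different temperatures or thresholds without touching the request code.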

Install the Gemini API library

Using Python 3.9+, install the google-generativeai package using the following pip command:

pip install -q -U google-generativeai

Make your first request

To make your first request to the Gemini API, you first need to get an API key from Google AI Studio.

This can be done by clicking on the “Get API Key” button on the left navigation menu.

Click on Get API Key

Then click on “Create API Key” and select the project for which you want to create the API Key. If you are not sure about how projects work, visit Getting Started With Google Gemini.

Create API Key

Use the “generateContent” method (generate_content in the Python SDK) to send a request to the Gemini API.

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Explain how AI works")
print(response.text)
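
Hard-coding the key is fine for a quick test, but you may prefer to keep it out of your source code. Here is a minimal sketch that reads the key from an environment variable instead. The variable name GEMINI_API_KEY and both helper names are assumptions made for this example.

```python
import os


def load_api_key(env=os.environ, var_name="GEMINI_API_KEY"):
    """Return the API key from the environment, or fail with a clear message."""
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


def first_request(prompt="Explain how AI works"):
    """Hypothetical helper: same request as above, with the key from the environment."""
    import google.generativeai as genai  # imported lazily; needs the package

    genai.configure(api_key=load_api_key())
    model = genai.GenerativeModel("gemini-1.5-flash")
    return model.generate_content(prompt).text
```

With the variable set in your shell, calling print(first_request()) should behave like the snippet above.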

Testing Gemini Pro

Let me give it a test prompt and see what it does:

Explain generative AI to a fifth-grade student.

The response comes back quickly, and I have to say I am quite impressed with Google’s Gemini model. It says:

“Generative AI is like a magic helper that can create new things from nothing, like pictures, stories, and even music. It’s super cool and fun to use.”
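
If you want to see the text appear as it is generated rather than all at once, the Python SDK can stream the response. This is a minimal sketch, assuming model is a genai.GenerativeModel created as in the first request. The function name stream_reply is made up for this example.

```python
def stream_reply(model, prompt):
    """Print each chunk as it arrives and return the complete text."""
    parts = []
    # stream=True makes generate_content yield partial responses.
    for chunk in model.generate_content(prompt, stream=True):
        print(chunk.text, end="", flush=True)
        parts.append(chunk.text)
    print()  # finish the line once the stream ends
    return "".join(parts)
```

For example, stream_reply(model, "Explain generative AI to a fifth-grade student.") would print the answer piece by piece.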

Gemini Pro Vision Model

The next model, Gemini Pro Vision, accepts both text and images as input. It is a generative model: the API takes a multimodal prompt and returns text. I will load an image and pass it as the input to the generate content function. If I upload the image without a text prompt, the model describes what the image shows.

The Gemini API can run inference on images and videos passed to it. When passed an image, a series of images, or a video, Gemini can:

  • Describe or answer questions about the content
  • Summarize the content
  • Extrapolate from the content

Using Gemini Pro Vision with a Prompt

Use the “media.upload” method of the File API to upload an image of any size.

After uploading the file, you can make GenerateContent requests that reference the File API URI. Select the generative model and provide it with a text prompt and the uploaded image.

import pathlib

# Assumption: `media` points at a local folder that contains the image.
media = pathlib.Path("media")

myfile = genai.upload_file(media / "Cajun_instruments.jpg")
print(f"{myfile=}")

model = genai.GenerativeModel("gemini-1.5-flash")
result = model.generate_content(
    [myfile, "\n\n", "Can you tell me about the instruments in this photo?"]
)
print(f"{result.text=}")
Cajun Instruments
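
Before uploading, it can help to confirm that the file really is an image and to pass its MIME type explicitly. The sketch below assumes genai.upload_file accepts path, mime_type, and display_name arguments. The helper names are invented for this example.

```python
import mimetypes
from pathlib import Path


def image_upload_args(path):
    """Build keyword arguments for an image upload, rejecting non-image files."""
    mime_type, _ = mimetypes.guess_type(str(path))
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"{path} does not look like an image")
    return {
        "path": str(path),
        "mime_type": mime_type,
        "display_name": Path(path).name,
    }


def upload_image(path):
    """Hypothetical helper: upload an image after genai.configure has been called."""
    import google.generativeai as genai  # imported lazily; needs the package

    return genai.upload_file(**image_upload_args(path))
```

The MIME type is guessed from the file extension only, so a mislabeled file would still pass this check.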

Verify image file upload and get metadata

You can verify that the API successfully stored the uploaded file, and get its metadata, by calling the “files.get” method. Only the name (and by extension, the uri) is unique.

# `media` is assumed to be a pathlib.Path for your local media folder.
myfile = genai.upload_file(media / "poem.txt")
file_name = myfile.name
print(file_name)  # "files/*"

myfile = genai.get_file(file_name)
print(myfile)
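
Printing the whole file object can be noisy. As a small sketch, the helper below picks out a few metadata fields. It assumes the returned object exposes attributes such as display_name, mime_type, and size_bytes, and any attribute that is missing is reported as None. The function name summarize_file is an assumption for this example.

```python
def summarize_file(file_obj):
    """Return selected metadata fields from an uploaded-file object."""
    fields = ("name", "display_name", "mime_type", "size_bytes", "uri")
    # getattr with a default keeps this safe if a field is not present.
    return {field: getattr(file_obj, field, None) for field in fields}
```

You could call it as summarize_file(genai.get_file(file_name)) to get a compact dictionary to log or compare.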

Prompt with multiple images

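You can provide the Gemini API with any combination of images and text that fits within the model’s context window. This example provides one short text prompt and three uploaded images; sample_file, sample_file_2, and sample_file_3 are assumed to be files you have already uploaded with genai.upload_file.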

from IPython.display import Markdown  # renders the reply in a notebook

# Choose a Gemini model.
model = genai.GenerativeModel(model_name="gemini-1.5-pro")
prompt = "Write an advertising jingle showing how the product in the first image could solve the problems shown in the second two images."
response = model.generate_content([prompt, sample_file, sample_file_2, sample_file_3])

Markdown(">" + response.text)
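
To check whether a combination of text and images fits within the context window before sending it, you can count the tokens first. This is a minimal sketch that relies on the model’s count_tokens method. The token_limit value is something you supply yourself, and the function name fits_in_context is an assumption for this example.

```python
def fits_in_context(model, contents, token_limit):
    """Return (fits, total_tokens) for the given prompt contents."""
    # count_tokens takes the same contents you would pass to generate_content.
    total = model.count_tokens(contents).total_tokens
    return total <= token_limit, total
```

For instance, fits_in_context(model, [prompt, sample_file, sample_file_2, sample_file_3], token_limit) tells you whether the request is likely to be accepted before you pay the cost of generating a reply.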

Error Resolution

Another great feature that Google has introduced is error explanation. Colab AI can explain an error for you and help you correct it, which makes it a genuinely useful addition.

 
