Generative AI Server

What is Generative AI?

The AI sector is growing rapidly, with an expected annual growth rate of 37.3% through 2030. This growth is driven mainly by advances in, and adoption of, Generative AI. Generative AI (GenAI) is a technology that uses artificial intelligence (AI) and deep learning to generate content, such as text, images or videos, based on user input known as prompts.

Key components are large language models (LLMs), which are based on neural networks and trained on text data to perform tasks such as answering questions, summarizing documents and translating. Well-known examples of LLMs include OpenAI's ChatGPT and Google's Bard, which have contributed significantly to the rapid growth of Generative AI.

In contrast to common cloud-based solutions, an On-Premise solution runs pre-trained base models directly on a company’s local LLM server (GenAI Server). This approach allows organisations to train models on their own data, increasing productivity while protecting sensitive data.

Benefits of our On-Premise GenAI servers

  • Increased data security & control

Run everything on-premise for high security and data protection: secure your proprietary models and keep sensitive data in-house.

  • Domain-specific model tuning

    Create models that are tailored to your specific needs and your company.

  • NVIDIA AI Enterprise ready hardware

    Our servers have been tested to ensure compatibility with NVIDIA® AI Enterprise and run NVIDIA® NIMs for a wide range of models.

Choose a server that suits your use case

A GenAI Server is a versatile platform that can host different types of generative AI models, starting with Large Language Models (LLMs). For companies looking to benefit from generative AI, it is critical to equip themselves with the right technology to unlock the full potential of these advanced models.

Our GenAI servers are designed to meet the requirements of different use cases.
From Entry through Plus and Pro to Ultimate, each tier supports a range of models and workloads, from running pre-trained models to intensive LLM training. The systems offer AI acceleration, various GPU options and extensive storage options.

Entry

  • Compact, active cooling

  • 1x NVIDIA® L4 with 24 GB GPU RAM

  • Edge GenAI inference: pre-trained models

  • Model size up to 8B parameters

Plus

  • Compact, passive cooling

  • NVIDIA® Jetson AGX Orin™ with 64 GB GPU RAM

  • Edge GenAI inference: pre-trained models, RAG

  • Model size up to 70B parameters

Pro

  • 19″ rack, active cooling

  • Up to 3x NVIDIA® RTX 6000 Ada with 48 GB GPU RAM each

  • Edge GenAI inference: pre-trained models, RAG

  • Edge AI model fine-tuning

  • Model size up to 70B parameters

Ultimate

  • 19″ rack, liquid cooling

  • Up to 2x NVIDIA® H100 NVL with 94 GB GPU RAM each (188 GB total)

  • Edge GenAI inference: pre-trained models, RAG

  • Edge AI model fine-tuning and extensive training

  • Model size over 100B parameters
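The model-size limits in the tiers above follow roughly from GPU memory: at FP16 each parameter occupies 2 bytes (half that at INT8, a quarter at INT4), plus headroom for the KV cache and activations. A rough sizing sketch, using the tier figures from the table; this rule of thumb is an approximation, not a guarantee:

```python
def weight_memory_gb(params_billion: float, bits_per_param: int = 16) -> float:
    """Approximate GPU memory needed just for the model weights."""
    bytes_per_param = bits_per_param / 8
    return params_billion * 1e9 * bytes_per_param / 1e9

# Entry tier: 1x L4 with 24 GB -> an 8B model at FP16 needs ~16 GB of weights
print(weight_memory_gb(8))      # 16.0
# Pro tier: 3x RTX 6000 Ada = 144 GB -> a 70B model at FP16 needs ~140 GB
print(weight_memory_gb(70))     # 140.0
# Quantized to INT4, the same 70B model shrinks to ~35 GB of weights
print(weight_memory_gb(70, 4))  # 35.0
```

This is why the Entry tier tops out around 8B parameters while the Pro tier can serve 70B models, and why quantization lets larger models run on smaller configurations.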

Use Case: Using GenAI to optimize internal knowledge

Your model, your data, your way

We have developed an open-source toolkit to test the performance of our hardware platforms. Choose your preferred LLM engine and model, and connect your knowledge base to create chatbots tailored to your needs.

Eurotech is already using this technology to support its sales staff with a knowledge-based co-pilot that has access to the internal knowledge database. The solution leverages NVIDIA® AI Enterprise, NVIDIA® Inference Microservices (NIM) for LLMs and open-source frameworks such as llama-index. To help you realize the full potential of your projects, we provide the toolkit here on GitHub. Check out the application in our demo video or read more here.
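A co-pilot of this kind pairs a NIM-served LLM with retrieved knowledge-base context. NIM services for LLMs expose an OpenAI-compatible HTTP API, so a request can be assembled like any chat-completions call. A minimal sketch; the endpoint URL, model id and document text below are placeholders, not values from the actual toolkit:

```python
import json

# Hypothetical local NIM endpoint (OpenAI-compatible chat-completions API)
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_request(question: str, context_chunks: list[str]) -> dict:
    """Assemble a chat-completions payload that grounds the model
    in retrieved knowledge-base chunks (the RAG pattern)."""
    context = "\n\n".join(context_chunks)
    return {
        "model": "meta/llama-3.1-8b-instruct",  # placeholder model id
        "messages": [
            {"role": "system",
             "content": "Answer using only the provided context:\n" + context},
            {"role": "user", "content": question},
        ],
        "temperature": 0.2,
        "max_tokens": 512,
    }

payload = build_request(
    "What is the operating temperature range?",
    ["Datasheet excerpt: operating temperature -25 °C to 60 °C."],  # placeholder chunk
)
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the endpoint; because the API is OpenAI-compatible, standard client libraries can be pointed at the local server instead of a cloud service.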

Chat with your data to boost productivity

The use of GenAI offers numerous opportunities across diverse industries, particularly in industrial automation, energy and medical technology. Simple interaction with a knowledge database improves collaboration between people, machines and knowledge databases, and thus increases productivity and efficiency in research and development.
There are various use cases for generative AI, from intelligent chatbots to knowledge-based AI copilots that can be further trained on databases.

Applications of GenAI

Intelligent chatbot

Focus on question-and-answer tasks

e.g. customer service, helpdesk

Knowledge-based copilot

An AI copilot establishes a connection to knowledge databases and performs context-related tasks such as creating, writing, coding and analysing content.

e.g. documentation co-pilot, assistant for IT

Visual agent

The visual agent is able to process multimedia (images and videos) with vision-language models (VLM). This allows insights from videos to be summarised, searched and extracted using natural language.

e.g. NVIDIA's Vision Insights Agent (VIA) for visual problem detection, inspection and signaling

Get free consulting

Are you ready to use GenAI?
Talk to an expert to find the right hardware configuration for your application.


    FAQ

    What is NVIDIA® NIM?

    NVIDIA® Inference Microservices (NVIDIA® NIM) is a framework for improving the efficiency and performance of AI models. It ensures that AI applications run optimally on the available hardware and increases the speed and reliability of real-time AI tasks.

    Why use NVIDIA® NIM for LLMs?

    Benefit from NVIDIA® Inference Microservices (NIM) for LLMs, powered by NVIDIA® AI Enterprise and backed by an experienced AI community. This approach simplifies and accelerates the deployment and training of customised, complex LLMs. NVIDIA® NIM provides optimized inference for more than two dozen popular AI models from NVIDIA® and its partners.

    How is Eurotech using the GenAI Server?

    With the help of its Generative AI Server, Eurotech has developed a co-pilot with access to the internal knowledge database to support its sales staff. Using Retrieval Augmented Generation (RAG), the chatbot was extended with Eurotech product data sheets and product manuals. Because the system runs securely within the company, the knowledge database could also be expanded with internal, domain-specific content such as design documents. Junior sales engineers now have an intelligent tutor at their side to support onboarding and answer customer queries. The solution uses NVIDIA® AI Enterprise, NVIDIA® NIM for LLMs and open-source frameworks such as llama-index, and is available on GitHub.

    What are large language models (LLMs)?

    LLMs are sophisticated models based on the principles of artificial intelligence and machine learning. They are trained to analyze and understand large amounts of text data in order to generate text that resembles human language.


    How can I use an LLM with my own data?

    If you want an LLM to draw on information from your own data, there are three options:

    1. Create your own LLM from scratch and train it on the desired data
    2. Fine-tune a pre-trained LLM with your own data
    3. Build a Retrieval Augmented Generation (RAG) system
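Of these, a RAG system (option 3) is usually the quickest to set up: documents are split into chunks, the chunks most relevant to a question are retrieved, and only those are passed to the LLM as context. A minimal, illustrative sketch of the retrieval step using simple word-overlap scoring; production systems such as llama-index use vector embeddings for this instead, and the knowledge-base text is invented for the example:

```python
import string

def _tokens(text: str) -> set[str]:
    """Lowercase, strip punctuation, and split into a set of words."""
    return set(text.lower().translate(
        str.maketrans("", "", string.punctuation)).split())

def score(question: str, chunk: str) -> int:
    """Count shared words between question and chunk
    (a crude stand-in for embedding similarity)."""
    return len(_tokens(question) & _tokens(chunk))

def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most relevant to the question."""
    return sorted(chunks, key=lambda c: score(question, c), reverse=True)[:k]

knowledge_base = [
    "The server supports up to three RTX 6000 Ada GPUs.",
    "Warranty claims must be filed within 24 months.",
    "Liquid cooling is available on the Ultimate tier.",
]
top = retrieve("Which GPUs does the server support?", knowledge_base)
print(top[0])  # the GPU chunk ranks first
```

The retrieved chunks are then placed into the LLM prompt as context, which is why RAG needs no retraining: updating the knowledge base is enough to change the answers.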

    © InoNet Computer GmbH. All rights reserved.