News

Wiki

Contact

Generative AI Server

What is Generative AI?

The AI sector is rapidly growing, with expectations of annual growth rate of 37.3% until 2030. This growth is mainly driven by the advancements and the adoption of Generative AI. Generative AI (GenAI) is a technology that uses artificial intelligence (AI) and deep learning to generate content, such as text, images or videos, based on user input, known as prompts.

Important elements are large language models (LLMs), which are based on neural networks and are trained with text data to perform tasks such as answering questions, summarizing documents and translating. Some of the well-known examples of LLMs include Open AI’s Chat GPT and Google’s Bard, which have significantly contributed to the rapid growth of Generative AI.

In contrast to common cloud-based solutions, an On-Premise solution runs pre-trained base models directly on a company’s local LLM server (GenAI Server). This approach allows organisations to train models on their own data, increasing productivity while protecting sensitive data.

Request suitable hardware configuration

Benefits of our On-Premise GenAI servers

Increased data security & control
On-premise for high security and data protection. Secure your proprietary models and protect sensitive data.
Domain-specific model tuning
Create models that are tailored to your specific needs and your company.
NVIDIA AI Enterprise ready hardware
Our servers have been tested to ensure compatibility with NVIDIA® AI Enterprise and run NVIDIA® NIMs for a wide range of models.

nvidia-preferred-partner-badge-rgb-transparent-for-screen

Choose a server that suits your use case

A GenAI Server is a versatile platform that can host different types of generative AI models, starting with Large Language Models (LLMs). For companies looking to benefit from generative AI, it is critical to equip themselves with the right technology to unlock the full potential of these advanced models.

Our GenAI servers are designed to fulfill different requirements for various use cases.
From Entry-Level to Plus and Pro to Ultimate, each level supports a variety of models and types, from the use of pre-trained models to intensive LLM training. The systems have AI acceleration functions, various GPU options and extensive storage options.

Entry

Plus

Pro

Ultimate

Compact, active cooling
1x NVIDIA® L4 with 24 GB GPU RAM
Edge GenAI Inference: pre-trained models
Model size up to 8B parameters

Compact, passive cooling
NVIDIA® Jetson AGX Orin™ with 64 GB GPU RAM
Edge GenAI Inference: pre-trained models, RAG
Model size up to 70B parameters

19″ rack, active cooling
Up to 3x NVIDIA® RTX 6000 Ada with 48 GB GPU RAM each
Edge GenAI Inference: pre-trained models, RAG
Edge AI Model Finetuning
Model size up to 70B parameters

19″ rack, liquid cooling
Up to 2x NVIDIA® H100 NVL with 188 GB RAM each
Edge GenAI Inference: pre-trained models, RAG
Edge AI Model Finetuning
Extensive training
Model size over 100B parameters

High-End Line

Add to wishlist i

Jetzt Produkt der Wunschliste hinzufügen und jederzeit im "Mein-Konto-Bereich" unter dem Punkt "Gespeicherte Produkte und Warenkörbe" darauf zugreifen.

Concepion®-LLM-v3-L07-R680E

Compact GenAI system for AI inference with Intel® Core™ i CPUs up to 14th Generation and NVIDIA® L4 Tensor Core GPU

High-Performance with Intel^® Core™ i CPUs 12th/13th/14th Generation
Including powerful NVIDIA^® L4 Tensor Core GPU for AI applications
Robust full industrial components for 24/7 operation
Two additional 2.5" shuttles
Free Copilot (open-source ChatBot)
GenAI ready

To product page

Preis auf Anfrage

High-End Line

Add to wishlist i

Jetzt Produkt der Wunschliste hinzufügen und jederzeit im "Mein-Konto-Bereich" unter dem Punkt "Gespeicherte Produkte und Warenkörbe" darauf zugreifen.

Eurotech ReliaCOR 33-11

The fanless Embedded Edge-AI-System, based on the NVIDIA® Jetson AGX Orin™, is designed for challenging AI applications

NVIDIA^® Jetson AGX Orin™, up to 275 TOPS
Best in Class Cybersecurity
Camera Support
Certified Radios
Road Vehicle Certified, E-Mark
Compact and fanless
Easy to Program
Cloud Certified
GenAI ready

To product page

Preis auf Anfrage

High-End Line

Add to wishlist i

Jetzt Produkt der Wunschliste hinzufügen und jederzeit im "Mein-Konto-Bereich" unter dem Punkt "Gespeicherte Produkte und Warenkörbe" darauf zugreifen.

Mayflower®-LLM18-SP3

19“ GenAI server for AI inference and model fine-tuning with up to 3x NVIDIA® RTX 6000 Ada GPUs

High-Performance with AMD^® EPYC™ CPU
Including 2x NVIDIA^® RTX 6000 powerful GPUs (3x on request)
2x 100 GBit LAN (QSFP28) NVIDIA^® ConnectX^®-6 Dx
Robust full-industrial components for 24/7 operation
Free Copilot (open-source ChatBot)
GenAI ready

To product page

Preis auf Anfrage

High-End Line

Add to wishlist i

Jetzt Produkt der Wunschliste hinzufügen und jederzeit im "Mein-Konto-Bereich" unter dem Punkt "Gespeicherte Produkte und Warenkörbe" darauf zugreifen.

Mayflower®-LLMW33-SP3

19“ liquid-cooled GenAI server for AI inference, model fine-tuning & training with NVIDIA® H100 NVL GPU

High-Performance with AMD^® EPYC™ CPU
Including NVIDIA^® H100 NVL powerful GPU (2x on request)
Liquid cooling for CPU and GPU
2x 100 GBit LAN (QSFP28) NVIDIA^® ConnectX^®-6 Dx
Robust full-industrial components for 24/7 operation
Free Copilot (open-source ChatBot)
GenAI ready

To product page

Preis auf Anfrage

Use Case: Using GenAI to optimize internal knowledge

Your model, your data, your way

We have developed an open-source toolkit to test the performance of our hardware platforms. Choose your preferred LLM engine and model and bring your knowledge base into the models to create chatbots that are tailored to your needs.

Eurotech is already using this technology to support its sales staff with a knowledge-based co-pilot with access to the internal knowledge database. The solution leverages NVIDIA® AI Enterprise, the NVIDIA® Inference Microservices (NIM) for LLMs and open source frameworks such as llama-index. To ensure that you can maximize the full potential of your projects, we provide you with the toolkit here on GitHub. Check out the application in our demo video or read more here.

Chat with your data to boost productivity

The use of GenAI offers numerous opportunities across diverse industries, particularly in industrial automation, energy and medical. Simple interaction with a knowledge database improves collaboration between man, machine and knowledge databases and thus increases productivity and efficiency in research and development.

There are various use cases for generative AI, from intelligent chatbots to knowledge-based AI copilots that can be further trained with databases.

Applications of GenAI

Intelligent chatbot

Focus on question-and-answer tasks

e.g. customer service employee, helpdesk

Knowledge-based copilot

An AI copilot establishes a connection to knowledge databases and performs context-related tasks such as creating, writing, coding and analysing content.

e.g. documentation co-pilot, assistant for IT

Visual agent

The visual agent is able to process multimedia (images and videos) with vision-language models (VLM). This allows insights from videos to be summarised, searched and extracted using natural language.

e.g. NVIDIAs Vision Insights Agent (VIA) for visual problem detection, inspection and signaling

Get free consulting

Are you ready to use GenAI?
Talk to an expert to find the right hardware configuration for your application.

FAQ

What is NVIDIA NIM?

NVIDIA® Inference Microservices (NVIDIA® NIMs) is a framework for improving the efficiency and performance of AI models. It ensures that AI applications run optimally on the hardware and increases the speed and reliability of real-time AI tasks.

Benefit from NVIDIA® Inference Microservices (NVIDIA® NIMs) for LLMs powered by NVIDIA® AI Enterprise, which is used by an experienced AI community. This approach simplifies and accelerates the deployment and training of customised, complex LLMs. NVIDIA® NIM ensures optimized inference with more than two dozen popular AI models from NVIDIA® and its partners.

How does a knowledge-based copilot with NVIDIA® NIM work?

With the help of its Generative AI Server, Eurotech has developed a co-pilot with access to the internal knowledge database to support its sales staff. Using Retrieval Augmented Generation (RAG), the chatbot was expanded to include Eurotech product data sheets and product manuals. The knowledge database has been expanded to include internal, domain-specific content such as design documents due to its secure use within the company. The young sales engineers now have an intelligent tutor at their side to support them with onboarding or answering customer queries. The solution uses NVIDIA® AI Enterprise, NVIDIA® NIM for LLMs and open source frameworks such as llama-index and is available on GitHub.

What are LLMs?

LLMs are sophisticated models based on the principles of artificial intelligence and machine learning. These models are specially trained to analyze and understand large amounts of text data in order to generate texts that are similar to human language.

What are the options for training the models?

If you want to use an LLM model that has information from your own data, there are three options:

Create your own LLM model from scratch and train it with the desired data
Fine-tune a pre-trained LLM model with your own data
Building a RAG system

InoNet Computer GmbH
Wettersteinstraße 18
82024 Taufkirchen
+49 (0)89 666 096 0
info@inonet.com

Generative AI Server

What is Generative AI?

Benefits of our On-Premise GenAI servers

Increased data security & control

Domain-specific model tuning

NVIDIA AI Enterprise ready hardware

Choose a server that suits your use case

Entry

Plus

Pro

Ultimate

Concepion®-LLM-v3-L07-R680E

Eurotech ReliaCOR 33-11

Mayflower®-LLM18-SP3

Mayflower®-LLMW33-SP3

Use Case: Using GenAI to optimize internal knowledge

Chat with your data to boost productivity

Applications of GenAI

Get free consulting

FAQ

Über Uns

Informationen

Social Media