Introduction
Are you drowning in an ocean of information? In today's fast-paced world, staying updated and informed is essential for both personal and professional growth. However, the sheer volume of content available can be overwhelming. Document summarization applications offer a powerful solution by condensing information into concise summaries, capturing the key points from lengthy training videos, educational lectures, promotional content, research presentations, and podcasts. This greatly enhances your ability to keep up.
In this article, we will explore the various use cases of a Document Summarization (DocSum) application implemented using the Open Platform for Enterprise AI (OPEA™). Discover how these innovative tools can enhance productivity and efficiency across different domains. We will walk you through the steps to deploy and test drive OPEA’s DocSum application on the Intel® Gaudi® 2 AI accelerator using Intel® Tiber™ AI Cloud. From setup to execution, we’ll cover everything you need to know to unlock the potential of document summarization and transform the way you interact with information using this cutting-edge GenAI application.
What is OPEA?
OPEA is an open platform consisting of composable building blocks for state-of-the-art generative AI systems. It is ideal for showcasing DocSum because it is flexible and cost-effective. OPEA makes it easy to integrate advanced AI solutions into business systems, speeding up development and adding value. It uses a modular approach with microservices for flexibility and megaservices for comprehensive solutions, simplifying the development and scaling of complex AI applications. OPEA also supports powerful hardware like Intel Gaudi 2 and Intel® Xeon® Scalable Processors, which are adept at handling the heavy demands of AI models. Plus, OPEA’s GenAIExamples repository demonstrates many different scenarios and makes accessing the services easy and user-friendly.
Overview of the DocSum Application
The DocSum application leverages advanced open-source Large Language Models (LLMs) to revolutionize the way we interact with text. These models can be used to create summaries of various types of documents.
With DocSum, you can efficiently generate concise and accurate summaries, enhancing your ability to process and understand large volumes of information. The application supports summarization from various sources, including:
- Plain text
- Documents (.txt, .doc, .docx, .pdf)
- Audio (.wav)
- Video (.mp4)
At its core, DocSum consists of three key components:
- User Interface Service: Provides an intuitive and user-friendly interface for interacting with the system. We provide two interface options: a Graphical User Interface (GUI) and a REST API.
- Domain Transform Service: Employs a variety of domain transformation tools to extract plain text from each supported input type, which is then prepared for processing by the LLM.
- LLM Service: Creates a summary of the document. This microservice leverages LangChain to implement summarization strategies and facilitate LLM inference using text generation inference (TGI). Based on the length of the context, a summary type can be selected: auto, stuff, truncate, map_reduce, or refine (see the example after this list). Details of the LLM service can be found in the OPEA GenAIComps repository.
This architecture ensures that DocSum can handle a wide range of input types, whether accessed through the GUI or from the terminal.
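To make the summary-type selection concrete, here is a sketch of a request sent directly to the LLM microservice. The endpoint, port (9000), and parameter names are assumptions based on OPEA's published examples and may differ in your release; host_ip is set in Step 3 of the tutorial below.

```bash
# Illustrative only: ask the LLM microservice for a map_reduce summary.
# Endpoint, port, and field names are assumptions -- verify against your OPEA release.
curl http://${host_ip}:9000/v1/docsum \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"messages": "Text to summarize goes here.", "max_tokens": 32, "summary_type": "map_reduce"}'
```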

Prerequisites
Before starting the setup of the DocSum application, make sure you have the following prerequisites:
- Hardware: Access to a machine equipped with two or more Intel Gaudi 2 processor cards is required. For this tutorial, we will utilize Intel Tiber AI Cloud, specifically an instance featuring 8 Gaudi 2 HL-225H mezzanine cards with 3rd Generation Xeon processors, 1 TB of RAM, and 20 TB of disk space. Because the UI runs on the remote machine, you will need to forward the UI port (5173) over SSH to access the user interface from your local browser.
- Docker Compose: Docker Compose will be employed to run the services. Ensure that it is installed on your machine.
Once these prerequisites are met, you can proceed with the step-by-step tutorial to set up and deploy the DocSum application on Intel Gaudi 2 using Intel Tiber AI Cloud.
Step-by-Step Tutorial
Follow these steps to get the DocSum application, along with the megaservice that orchestrates its component services, up and running on Intel Gaudi 2 using Intel Tiber AI Cloud, and start summarizing your own data.
Step 1: Connect to Your Gaudi Machine
To begin, if you are using Tiber Cloud, initiate a Gaudi 2 instance and wait until it reaches the “Ready” state. Once the instance is ready, connect to the virtual machine (VM) via SSH with port forwarding to access the user interface on port 5173. Use the following command:
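A representative command is shown below; substitute the user name, key file, and public IP of your instance from the Intel Tiber AI Cloud console.

```bash
# Connect over SSH and forward local port 5173 to port 5173 on the instance
ssh -L 5173:localhost:5173 -i <path-to-private-key> <user>@<instance-public-ip>
```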
Step 2: Clone the GenAIExamples Repository
Next, download the GenAIExamples repository, which contains the necessary files for the DocSum application. Clone the repository and navigate to the DocSum application's docker compose directory; it includes options for running with different hardware configurations, including Intel Gaudi.
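For example (the subdirectory layout below reflects the OPEA 1.2 repository structure; confirm it against your checkout):

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/DocSum/docker_compose
ls  # hardware-specific configurations, e.g. intel/hpu/gaudi and intel/cpu/xeon
```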
Step 3: Configure the Environment
Proceed to configure the environment by setting the necessary environment variables for the host IP (external public IP) and your Hugging Face token. Execute the set_env.sh script for Gaudi. If your enterprise operates behind a proxy, OPEA components will require specific proxy environment variables. Many parameters can be customized by editing the set_env.sh script before sourcing it. For instance, on Gaudi, the LLM model defaults to Intel/neural-chat-7b-v3-3.
To set up environment variables follow these steps:
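A minimal sketch, using the variable names expected by the OPEA scripts:

```bash
# External public IP of the Gaudi machine
export host_ip=<your-external-public-ip>

# Hugging Face token used to download gated models
export HUGGINGFACEHUB_API_TOKEN=<your-hugging-face-token>

# Only needed if your enterprise operates behind a proxy
export http_proxy=<your-http-proxy>
export https_proxy=<your-https-proxy>
export no_proxy=localhost,127.0.0.1,${host_ip}
```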
Some Hugging Face models are gated and require a token, along with accepting Hugging Face's terms of use on the model card page.
The set_env.sh script also specifies the port numbers for the services, which can be modified if necessary. After reviewing and editing the entries in the set_env.sh script, source it in your environment:
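For example, from the directory containing the script:

```bash
# Apply the environment settings to the current shell
source ./set_env.sh
```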
Step 4: Start the Services with Docker Compose
To launch the services, use Docker Compose. The Docker Compose YAML configuration file defines and manages the multi-container DocSum application, handling networks, variables, ports, and other dependencies. Ensure you are in the GenAIExamples/DocSum/docker_compose directory, then execute the following command to start the services:
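From there, change into the Gaudi subdirectory (intel/hpu/gaudi in the OPEA 1.2 layout) and bring up the services in detached mode:

```bash
cd intel/hpu/gaudi
docker compose up -d
```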
Step 5: Monitor the Service Initialization
Docker Compose will pull images from DockerHub and start the containers according to the configuration file. Some services, such as llm-docsum-server, may take several minutes to initialize, depending on your environment. You can monitor the readiness of the services before testing by following the container logs. Here’s how:
- Get the list of container names (see the commands after this list).
Expected container names for DocSum:
docsum-xeon-ui-server
docsum-xeon-backend-server
llm-docsum-server
whisper-server
tgi-server
- Check the status of the desired container by following its logs.
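Both checks use standard Docker tooling; the container name passed to docker logs should be one of the expected names listed above.

```bash
# List running containers with their names and status
docker ps --format 'table {{.Names}}\t{{.Status}}'

# Follow the logs of a container until it reports that it is ready
docker logs -f llm-docsum-server
```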
Step 6: Access the User Interface (UI)
The UI for DocSum makes it easy to interact with the system. We provide two interface options:
- GUI: Included with the DocSum application example and deployed in a Docker container. It offers a user-friendly way to interact with the system by allowing users to input text, upload documents, and view summary responses. From your browser, navigate to http://127.0.0.1:5173 to access the DocSum application UI.
Figure 2. Creating a Summary of Text
Users can paste the text to be summarized into the text box. Clicking the "Generate Summary" button starts summarization. Additionally, users can upload files from their local device; once a file is uploaded, summarization of the document starts automatically. A condensed summary of the content is produced and displayed in the "Summary" box on the right.
- REST API: All functionality provided by the GUI can also be accessed via curl commands. This provides a programmatic interface, enabling integration of the DocSum application with other applications.
Here are some examples, covering English mode (the default), Chinese mode, and file upload.
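These requests are a sketch based on the OPEA 1.2 DocSum examples; the gateway port (8888) and form field names are taken from those examples and may differ in your release.

```bash
# Summarize plain text in English (the default language)
curl http://${host_ip}:8888/v1/docsum \
  -F "type=text" \
  -F "messages=Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models." \
  -F "max_tokens=32" \
  -F "language=en"

# Summarize plain text in Chinese mode
curl http://${host_ip}:8888/v1/docsum \
  -F "type=text" \
  -F "messages=<Chinese text to summarize>" \
  -F "max_tokens=32" \
  -F "language=zh"

# Upload a document file for summarization
curl http://${host_ip}:8888/v1/docsum \
  -F "type=text" \
  -F "messages=" \
  -F "files=@/path/to/your/file.txt" \
  -F "max_tokens=32" \
  -F "language=en"
```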
Audio and video file uploads are not supported in DocSum with a curl request in OPEA 1.2. Please use the GUI for these types of uploads. However, you can still pass a base64 string of the audio or video file as follows:
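For example (following the same request shape as above; the base64 payload is a placeholder, not a real recording):

```bash
# Encode a .wav file to a single-line base64 string
base64 -w 0 input.wav

# Summarize an audio clip passed as a base64-encoded string
curl http://${host_ip}:8888/v1/docsum \
  -F "type=audio" \
  -F "messages=<base64-encoded-audio-string>" \
  -F "max_tokens=32" \
  -F "language=en"
```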
Step 7: Shutdown
Once you have completed using the DocSum application service, it's important to stop all running containers to release system resources. Navigate to the directory where your Docker Compose YAML file is located and execute the following command:
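From the same directory used to start the services:

```bash
# Stop and remove the DocSum containers and their network
docker compose down
```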
Conclusion
In an era overwhelmed by information, the ability to quickly and effectively summarize content is invaluable. The DocSum application, implemented using OPEA, offers a powerful solution to this challenge. By leveraging state-of-the-art generative AI systems and robust hardware like the Intel® Gaudi® 2 AI accelerator, DocSum transforms the way we interact with information, making it more accessible and manageable.
Throughout this article, we have explored the potential of document summarization across various domains. The flexibility and cost-effectiveness of OPEA make it an ideal platform for deploying advanced AI solutions, enhancing productivity and efficiency.
We encourage you to experiment with the DocSum application, integrate it into your workflows, and experience firsthand the benefits it brings. Your feedback is invaluable, so please share your experiences and suggestions through our GitHub repository.
Acknowledgements
Thank you to our colleagues who made contributions and helped to review this blog: Melanie Hart Buehler, Dina Suehiro Jones, Harsha Ramayanam, and Abolfazl Shahbazi.