Preprocessing Unstructured Data for LLM Applications
By Matthew Robinson on Coursera
About This Course
Enhancing a RAG system’s performance depends on efficiently processing diverse unstructured data sources. In this course, you’ll learn techniques for representing all sorts of unstructured data, like text, images, and tables, from many different sources, and implement them to extend your LLM RAG pipeline to include Excel, Word, PowerPoint, PDF, and EPUB files. The course covers:
1. How to preprocess data for your LLM application development, focusing on how to work with different document types.
2. How to extract and normalize various documents into a common JSON format and enrich it with metadata to improve search results.
3. Techniques for document image analysis, including layout detection and vision transformers, to extract and understand PDFs, images, and tables.
4. How to build a RAG bot that is able to ingest different documents like PDFs, PowerPoints, and Markdown files.
Apply the skills you’ll learn in this course to real-world scenarios, enhancing your RAG application and expanding its versatility.
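To give a rough sense of what normalizing documents into a common JSON format and enriching them with metadata can look like, here is a minimal sketch using only the Python standard library. It is not the course's code or any particular library's API: the `Element` schema, the `normalize` and `filter_by_metadata` helpers, the sample blocks, and the file names are all hypothetical, and the input is assumed to have already been parsed into (kind, text, page) tuples by some upstream parser.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Element:
    """One normalized piece of a document (hypothetical schema)."""
    type: str                      # e.g. "Title", "NarrativeText", "Table"
    text: str
    metadata: dict = field(default_factory=dict)


def normalize(raw_blocks, filename, filetype):
    """Turn (kind, text, page) tuples from any parser into Elements."""
    elements = []
    for kind, text, page in raw_blocks:
        cleaned = " ".join(text.split())   # collapse runs of whitespace
        if not cleaned:                    # skip blocks that are empty
            continue
        metadata = {
            "filename": filename,
            "filetype": filetype,
            "page_number": page,
        }
        elements.append(Element(kind, cleaned, metadata))
    return elements


def filter_by_metadata(elements, **wanted):
    """Keep only elements whose metadata matches every key/value given."""
    return [
        e for e in elements
        if all(e.metadata.get(k) == v for k, v in wanted.items())
    ]


# Hypothetical parser output for two different source formats.
pdf_blocks = [
    ("Title", "Quarterly  Report", 1),
    ("NarrativeText", "Revenue grew\nin Q2.", 2),
    ("NarrativeText", "   ", 2),
]
pptx_blocks = [("Title", "Roadmap", 1)]

corpus = (
    normalize(pdf_blocks, "report.pdf", "pdf")
    + normalize(pptx_blocks, "deck.pptx", "pptx")
)

# Every source now shares one JSON shape, regardless of original format.
as_json = json.dumps([asdict(e) for e in corpus], indent=2)

# Metadata lets a search step narrow candidates before ranking them.
pdf_only = filter_by_metadata(corpus, filetype="pdf")
```

Because every element carries the same fields, a retrieval step can filter on metadata such as `filetype` or `page_number` before comparing text, which is one way metadata can improve search results.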
Frequently Asked Questions
How much does Preprocessing Unstructured Data for LLM Applications cost?
Visit the Preprocessing Unstructured Data for LLM Applications course page for current pricing and available discounts.
Who teaches Preprocessing Unstructured Data for LLM Applications?
Preprocessing Unstructured Data for LLM Applications is taught by Matthew Robinson, DeepLearning.AI.
What skill level is Preprocessing Unstructured Data for LLM Applications for?
This course is designed for learners at all levels.
Similar Courses
HTML & CSS Coding for Beginners: Build your own portfolio!
Chris Dixon
Maya for Beginners: Animation
Lucas Ridley
JavaScript for Beginners (includes 6+ real life projects)
Kalob Taulien
Beginner Bootstrap 4: Hand code beautiful responsive websites fast
Chris Dixon