Follow TextIn's latest updates to stay informed about the newest product developments. Text Intelligence has focused on intelligent document processing for 17 years, providing users worldwide with a document parsing service that turns complex documents such as PDFs and images into structured data.
Customer Success Story: How TextIn's Document Parser Powers Data Infrastructure in FinTech
2024-11-01 15:18:34

In the rapidly evolving landscape of AI applications, success stories from early adopters provide valuable insights into real-world implementations. Today, we're excited to share how a leading financial technology company leveraged TextIn's document parsing capabilities to overcome critical data challenges.

Meet Company Z: Pioneering AI+SaaS in Capital Markets

Company Z stands at the forefront of capital market digitalization, providing AI-powered SaaS solutions to listed companies, financial institutions, and regulatory bodies. Their product suite includes:

  • Enterprise Platform: An integrated solution covering eight key areas including information disclosure, compliance trading, and shareholder analysis
  • Special Stock Management System: Helping securities firms manage stock trading compliance for major shareholders and executives
  • Enterprise Legal Database: A comprehensive compliance knowledge base that has gained significant market recognition

The Challenge: Awakening Data from PDFs

While building their data infrastructure, Company Z faced a significant challenge: extracting high-quality, structured data from various document types, particularly PDFs. Their use cases included:

  • Real-time announcements from listed companies and banks
  • Annual and semi-annual reports
  • Analysis reports requiring markdown annotations
  • Executive information embedded in complex tables

The technical team initially built an in-house solution on PyMuPDF, but ran into three persistent problems:

  1. Scanned Documents: Unable to process scanned PDFs effectively
  2. Character Encoding: Special fonts causing text to appear as gibberish
  3. Borderless Tables: Difficulty in detecting and parsing tables without visible borders
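
To make the first two problems concrete, here is a minimal triage sketch, not Company Z's actual code. It assumes you already have each page's extracted text and embedded image count (with PyMuPDF these would typically come from `page.get_text()` and `len(page.get_images())`). The function names, labels, and thresholds are illustrative assumptions and would need tuning on real documents.

```python
import unicodedata


def garbled_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that look like encoding debris."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0

    def is_suspect(c: str) -> bool:
        # U+FFFD is the Unicode replacement character; categories Co, Cn, Cc
        # are private-use, unassigned, and control characters, which often
        # appear when a special font has no usable character mapping.
        return c == "\ufffd" or unicodedata.category(c) in {"Co", "Cn", "Cc"}

    return sum(is_suspect(c) for c in chars) / len(chars)


def triage_page(text: str, image_count: int,
                min_chars: int = 20, max_garbled: float = 0.3) -> str:
    """Label a page 'scanned', 'garbled', or 'text_ok'.

    The thresholds are arbitrary starting points, not recommended values.
    """
    stripped = text.strip()
    # Almost no text layer but at least one image: probably a scanned page.
    if len(stripped) < min_chars and image_count > 0:
        return "scanned"
    # A text layer exists but mostly decodes to debris: font/encoding issue.
    if garbled_ratio(stripped) > max_garbled:
        return "garbled"
    return "text_ok"
```

Pages labeled "scanned" or "garbled" are the ones a plain text-layer extractor cannot handle by itself, so they are the candidates for an OCR-based path.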

The Solution: TextIn's PDF Parser (TextIn ParseX)

After evaluating multiple solutions, Company Z chose TextIn's PDF Parser (TextIn ParseX) for its superior accuracy and comprehensive feature set. Here's how TextIn addressed their key challenges:

1. Borderless Table Recognition
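
This post does not describe how TextIn ParseX recognizes borderless tables. As a generic illustration of why the problem is hard, the sketch below uses a simple whitespace-gap heuristic: with no ruling lines to follow, columns have to be inferred from where words sit horizontally. The input format (rows of `(x0, x1, text)` tuples), the function names, and the `min_gap` threshold are all assumptions made for this example.

```python
def infer_columns(extents, min_gap=10.0):
    """Merge horizontal word extents into column spans.

    extents: iterable of (x0, x1) pairs gathered from every row of a table
    region. Extents closer together than min_gap are merged into one column.
    """
    columns = []
    for x0, x1 in sorted(extents):
        if columns and x0 - columns[-1][1] < min_gap:
            # Overlaps or nearly touches the current column: extend it.
            columns[-1][1] = max(columns[-1][1], x1)
        else:
            # A wide enough gap starts a new column.
            columns.append([x0, x1])
    return [tuple(c) for c in columns]


def rows_to_grid(rows, min_gap=10.0):
    """Place each word into the column whose span contains its center.

    rows: list of rows, each a list of (x0, x1, text) tuples.
    Returns a list of rows, each a list of cell strings ('' for empty cells).
    """
    extents = [(x0, x1) for row in rows for x0, x1, _ in row]
    columns = infer_columns(extents, min_gap)
    grid = []
    for row in rows:
        cells = [""] * len(columns)
        for x0, x1, text in row:
            center = (x0 + x1) / 2
            for i, (start, end) in enumerate(columns):
                if start <= center <= end:
                    cells[i] = (cells[i] + " " + text).strip()
                    break
        grid.append(cells)
    return grid
```

A heuristic like this breaks down quickly on merged cells, wrapped text, or columns separated by narrow gaps, which is part of why borderless tables call for more sophisticated recognition.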

2. Advanced OCR Capabilities

  • Accurate processing of scanned documents
  • Proper handling of special fonts and encodings
  • Conversion of image-based information into machine-readable formats

3. Flexible SDK Features

  • Selective extraction of tables, formulas, or handwritten content
  • Support for various output formats (JSON, Markdown)
  • Easy integration with existing systems
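
As a rough sketch of what working with these output formats can look like downstream, the example below filters tables out of a parsed JSON document and renders one as a Markdown table. The JSON shape used here (an `elements` list whose table entries carry `header` and `rows`) is a hypothetical schema invented for illustration, not TextIn's actual response format; consult the official API documentation for the real field names.

```python
import json


def extract_tables(payload: str) -> list:
    """Return only the table elements from a parsed-document JSON string.

    Assumes a hypothetical schema: {"elements": [{"type": "table", ...}, ...]}.
    """
    doc = json.loads(payload)
    return [el for el in doc.get("elements", []) if el.get("type") == "table"]


def table_to_markdown(table: dict) -> str:
    """Render a {"header": [...], "rows": [[...], ...]} table as Markdown."""

    def fmt(cells):
        # Escape pipes so cell text cannot break the table layout.
        return "| " + " | ".join(str(c).replace("|", "\\|") for c in cells) + " |"

    header = table["header"]
    lines = [fmt(header), "| " + " | ".join("---" for _ in header) + " |"]
    lines += [fmt(row) for row in table["rows"]]
    return "\n".join(lines)
```

Keeping selection (`extract_tables`) separate from rendering (`table_to_markdown`) mirrors the selective-extraction idea above: a pipeline can pull only the element types it needs and convert them to whichever format the next stage expects.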

The Impact

By implementing TextIn's solution, Company Z achieved:

  • Higher data accuracy in their compliance monitoring systems
  • Faster processing of financial documents
  • More reliable extraction of executive information
  • Enhanced ability to train downstream AI models

Looking Ahead

TextIn continues to enhance its parsing capabilities based on user feedback:

  • Adding coordinate information for table cells
  • Improving nested and cross-page table recognition
  • Enhancing font format detection (bold, italic, different sizes)
  • Streamlining the user interface and API integration options

For developers and businesses interested in document parsing solutions, TextIn offers new users a free trial with a 100-page processing credit. Our team is committed to helping you explore how our technology can address your specific use cases.

Key Takeaways

  1. Document parsing remains a critical challenge in financial technology
  2. Borderless tables require sophisticated recognition algorithms
  3. A comprehensive solution must handle various document types and formats
  4. Accurate data extraction forms the foundation for downstream AI applications


[Note: This case study has been shared with permission from the client. Some details have been anonymized to protect confidential information.]
