Pull requests: Unstructured-IO/unstructured
fix: map long-s-t ligature (U+FB05) to 'st' not 'ft'
#4386 opened Jul 5, 2026 by NishchayMahor
feat(staging): render list bullets and title heading depth in markdown element_to_md() function
#4384 opened Jul 1, 2026 by aaronsteers
feat: add max_page to chunk_by_title and remove multipage_sections
#4382 opened Jun 30, 2026 by issahammoud
5 tasks done
Detect Confluence-exported DOC files stored in FileContents stream
#4374 opened Jun 12, 2026 by zanvari
fix: preserve numeric text in table HTML metadata
#4373 opened Jun 12, 2026 by gyx09212214-prog
fix: derive crop box from coordinate extent in save_elements
#4371 opened Jun 9, 2026 by badGarnet (Collaborator)
feat: Implement PDF heading hierarchy inference for category_depth
#4369 opened Jun 9, 2026 by ylcnymn
fix: Support text partitioning from ZipExtFile objects
#4350 opened May 11, 2026 by dsolankii
fix: avoid false-positive Title classification for long no-space text
#4348 opened Apr 28, 2026 by claytonlin1110 (Contributor)
1 of 4 tasks
fix: prefer embedded PDF text over OCR for hi_res table tokens
#4347 opened Apr 28, 2026 by claytonlin1110 (Contributor)
fix(html): enable huge_tree on HTMLParser so deeply nested HTML partitions
#4340 opened Apr 16, 2026 by CrepuscularIRIS
3 tasks done
feat: add clean_newline utility for hyphenated line breaks (#2513)
#4339 opened Apr 16, 2026 by DevAbdullah90
fix: convert Tesseract language codes for PaddleOCR in OCRAgent.get_agent()
#4329 opened Apr 9, 2026 by Mustafa-Shoukat1
feat: add AG2 multi-agent document processing example
#4326 opened Apr 7, 2026 by faridun-ag2
7 tasks done
feat: infer hierarchical heading levels (H1-H6) for PDFs (#4204)
#4325 opened Apr 7, 2026 by statxc
2 tasks done
refactor: don't import unstructured-inference via partition.pdf
#4284 opened Mar 16, 2026 by artdent
fix: improve multi-column layout sorting for academic papers (#4104)
#4283 opened Mar 16, 2026 by Gopesh111