with '#' will be ignored, and an empty message aborts the commit. On branch main Your branch is up to date with 'origin/main'. Changes to be committed: new file: .claude/skills/algorithmic-art/.openskills.json new file: .claude/skills/algorithmic-art/LICENSE.txt new file: .claude/skills/algorithmic-art/SKILL.md new file: .claude/skills/algorithmic-art/templates/generator_template.js new file: .claude/skills/algorithmic-art/templates/viewer.html new file: .claude/skills/brand-guidelines/.openskills.json new file: .claude/skills/brand-guidelines/LICENSE.txt new file: .claude/skills/brand-guidelines/SKILL.md new file: .claude/skills/canvas-design/.openskills.json new file: .claude/skills/canvas-design/LICENSE.txt new file: .claude/skills/canvas-design/SKILL.md new file: .claude/skills/canvas-design/canvas-fonts/ArsenalSC-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/ArsenalSC-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/BigShoulders-Bold.ttf new file: .claude/skills/canvas-design/canvas-fonts/BigShoulders-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/BigShoulders-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/Boldonse-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/Boldonse-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/BricolageGrotesque-Bold.ttf new file: .claude/skills/canvas-design/canvas-fonts/BricolageGrotesque-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/BricolageGrotesque-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/CrimsonPro-Bold.ttf new file: .claude/skills/canvas-design/canvas-fonts/CrimsonPro-Italic.ttf new file: .claude/skills/canvas-design/canvas-fonts/CrimsonPro-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/CrimsonPro-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/DMMono-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/DMMono-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/EricaOne-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/EricaOne-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/GeistMono-Bold.ttf new file: .claude/skills/canvas-design/canvas-fonts/GeistMono-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/GeistMono-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/Gloock-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/Gloock-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/IBMPlexMono-Bold.ttf new file: .claude/skills/canvas-design/canvas-fonts/IBMPlexMono-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/IBMPlexMono-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/IBMPlexSerif-Bold.ttf new file: .claude/skills/canvas-design/canvas-fonts/IBMPlexSerif-BoldItalic.ttf new file: .claude/skills/canvas-design/canvas-fonts/IBMPlexSerif-Italic.ttf new file: .claude/skills/canvas-design/canvas-fonts/IBMPlexSerif-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/InstrumentSans-Bold.ttf new file: .claude/skills/canvas-design/canvas-fonts/InstrumentSans-BoldItalic.ttf new file: .claude/skills/canvas-design/canvas-fonts/InstrumentSans-Italic.ttf new file: .claude/skills/canvas-design/canvas-fonts/InstrumentSans-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/InstrumentSans-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/InstrumentSerif-Italic.ttf new file: .claude/skills/canvas-design/canvas-fonts/InstrumentSerif-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/Italiana-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/Italiana-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/JetBrainsMono-Bold.ttf new file: .claude/skills/canvas-design/canvas-fonts/JetBrainsMono-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/JetBrainsMono-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/Jura-Light.ttf new file: .claude/skills/canvas-design/canvas-fonts/Jura-Medium.ttf new file: .claude/skills/canvas-design/canvas-fonts/Jura-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/LibreBaskerville-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/LibreBaskerville-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/Lora-Bold.ttf new file: .claude/skills/canvas-design/canvas-fonts/Lora-BoldItalic.ttf new file: .claude/skills/canvas-design/canvas-fonts/Lora-Italic.ttf new file: .claude/skills/canvas-design/canvas-fonts/Lora-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/Lora-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/NationalPark-Bold.ttf new file: .claude/skills/canvas-design/canvas-fonts/NationalPark-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/NationalPark-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/NothingYouCouldDo-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/NothingYouCouldDo-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/Outfit-Bold.ttf new file: .claude/skills/canvas-design/canvas-fonts/Outfit-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/Outfit-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/PixelifySans-Medium.ttf new file: .claude/skills/canvas-design/canvas-fonts/PixelifySans-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/PoiretOne-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/PoiretOne-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/RedHatMono-Bold.ttf new file: .claude/skills/canvas-design/canvas-fonts/RedHatMono-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/RedHatMono-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/Silkscreen-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/Silkscreen-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/SmoochSans-Medium.ttf new file: .claude/skills/canvas-design/canvas-fonts/SmoochSans-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/Tektur-Medium.ttf new file: .claude/skills/canvas-design/canvas-fonts/Tektur-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/Tektur-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/WorkSans-Bold.ttf new file: .claude/skills/canvas-design/canvas-fonts/WorkSans-BoldItalic.ttf new file: .claude/skills/canvas-design/canvas-fonts/WorkSans-Italic.ttf new file: .claude/skills/canvas-design/canvas-fonts/WorkSans-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/WorkSans-Regular.ttf new file: .claude/skills/canvas-design/canvas-fonts/YoungSerif-OFL.txt new file: .claude/skills/canvas-design/canvas-fonts/YoungSerif-Regular.ttf new file: .claude/skills/doc-coauthoring/.openskills.json new file: .claude/skills/doc-coauthoring/SKILL.md new file: .claude/skills/docx/.openskills.json new file: .claude/skills/docx/LICENSE.txt new file: .claude/skills/docx/SKILL.md new file: .claude/skills/docx/scripts/__init__.py new file: .claude/skills/docx/scripts/accept_changes.py new file: .claude/skills/docx/scripts/comment.py new file: .claude/skills/docx/scripts/office/helpers/__init__.py new file: .claude/skills/docx/scripts/office/helpers/merge_runs.py new file: .claude/skills/docx/scripts/office/helpers/simplify_redlines.py new file: .claude/skills/docx/scripts/office/pack.py new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-chart.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-chartDrawing.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-diagram.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-lockedCanvas.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-main.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-picture.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-spreadsheetDrawing.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-wordprocessingDrawing.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/pml.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-additionalCharacteristics.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-bibliography.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-commonSimpleTypes.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-customXmlDataProperties.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-customXmlSchemaProperties.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesCustom.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesExtended.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesVariantTypes.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-math.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-relationshipReference.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/sml.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-main.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-officeDrawing.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-presentationDrawing.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-spreadsheetDrawing.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-wordprocessingDrawing.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/wml.xsd new file: .claude/skills/docx/scripts/office/schemas/ISO-IEC29500-4_2016/xml.xsd new file: .claude/skills/docx/scripts/office/schemas/ecma/fouth-edition/opc-contentTypes.xsd new file: .claude/skills/docx/scripts/office/schemas/ecma/fouth-edition/opc-coreProperties.xsd new file: .claude/skills/docx/scripts/office/schemas/ecma/fouth-edition/opc-digSig.xsd new file: .claude/skills/docx/scripts/office/schemas/ecma/fouth-edition/opc-relationships.xsd new file: .claude/skills/docx/scripts/office/schemas/mce/mc.xsd new file: .claude/skills/docx/scripts/office/schemas/microsoft/wml-2010.xsd new file: .claude/skills/docx/scripts/office/schemas/microsoft/wml-2012.xsd new file: .claude/skills/docx/scripts/office/schemas/microsoft/wml-2018.xsd new file: .claude/skills/docx/scripts/office/schemas/microsoft/wml-cex-2018.xsd new file: .claude/skills/docx/scripts/office/schemas/microsoft/wml-cid-2016.xsd new file: .claude/skills/docx/scripts/office/schemas/microsoft/wml-sdtdatahash-2020.xsd new file: .claude/skills/docx/scripts/office/schemas/microsoft/wml-symex-2015.xsd new file: .claude/skills/docx/scripts/office/soffice.py new file: .claude/skills/docx/scripts/office/unpack.py new file: .claude/skills/docx/scripts/office/validate.py new file: .claude/skills/docx/scripts/office/validators/__init__.py new file: .claude/skills/docx/scripts/office/validators/base.py new file: .claude/skills/docx/scripts/office/validators/docx.py new file: .claude/skills/docx/scripts/office/validators/pptx.py new file: .claude/skills/docx/scripts/office/validators/redlining.py new file: .claude/skills/docx/scripts/templates/comments.xml new file: .claude/skills/docx/scripts/templates/commentsExtended.xml new file: .claude/skills/docx/scripts/templates/commentsExtensible.xml new file: .claude/skills/docx/scripts/templates/commentsIds.xml new file: .claude/skills/docx/scripts/templates/people.xml new file: .claude/skills/frontend-design/.openskills.json new file: .claude/skills/frontend-design/LICENSE.txt new file: .claude/skills/frontend-design/SKILL.md new file: .claude/skills/internal-comms/.openskills.json new file: .claude/skills/internal-comms/LICENSE.txt new file: .claude/skills/internal-comms/SKILL.md new file: .claude/skills/internal-comms/examples/3p-updates.md new file: .claude/skills/internal-comms/examples/company-newsletter.md new file: .claude/skills/internal-comms/examples/faq-answers.md new file: .claude/skills/internal-comms/examples/general-comms.md new file: .claude/skills/mcp-builder/.openskills.json new file: .claude/skills/mcp-builder/LICENSE.txt new file: .claude/skills/mcp-builder/SKILL.md new file: .claude/skills/mcp-builder/reference/evaluation.md new file: .claude/skills/mcp-builder/reference/mcp_best_practices.md new file: .claude/skills/mcp-builder/reference/node_mcp_server.md new file: .claude/skills/mcp-builder/reference/python_mcp_server.md new file: .claude/skills/mcp-builder/scripts/connections.py new file: .claude/skills/mcp-builder/scripts/evaluation.py new file: .claude/skills/mcp-builder/scripts/example_evaluation.xml new file: .claude/skills/mcp-builder/scripts/requirements.txt new file: .claude/skills/pdf/.openskills.json new file: .claude/skills/pdf/LICENSE.txt new file: .claude/skills/pdf/SKILL.md new file: .claude/skills/pdf/forms.md new file: .claude/skills/pdf/reference.md new file: .claude/skills/pdf/scripts/check_bounding_boxes.py new file: .claude/skills/pdf/scripts/check_fillable_fields.py new file: .claude/skills/pdf/scripts/convert_pdf_to_images.py new file: .claude/skills/pdf/scripts/create_validation_image.py new file: .claude/skills/pdf/scripts/extract_form_field_info.py new file: .claude/skills/pdf/scripts/extract_form_structure.py new file: .claude/skills/pdf/scripts/fill_fillable_fields.py new file: .claude/skills/pdf/scripts/fill_pdf_form_with_annotations.py new file: .claude/skills/pptx/.openskills.json new file: .claude/skills/pptx/LICENSE.txt new file: .claude/skills/pptx/SKILL.md new file: .claude/skills/pptx/editing.md new file: .claude/skills/pptx/pptxgenjs.md new file: .claude/skills/pptx/scripts/__init__.py new file: .claude/skills/pptx/scripts/add_slide.py new file: .claude/skills/pptx/scripts/clean.py new file: .claude/skills/pptx/scripts/office/helpers/__init__.py new file: .claude/skills/pptx/scripts/office/helpers/merge_runs.py new file: .claude/skills/pptx/scripts/office/helpers/simplify_redlines.py new file: .claude/skills/pptx/scripts/office/pack.py new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-chart.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-chartDrawing.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-diagram.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-lockedCanvas.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-main.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-picture.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-spreadsheetDrawing.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-wordprocessingDrawing.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/pml.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-additionalCharacteristics.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-bibliography.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-commonSimpleTypes.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-customXmlDataProperties.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-customXmlSchemaProperties.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesCustom.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesExtended.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesVariantTypes.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-math.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-relationshipReference.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/sml.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-main.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-officeDrawing.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-presentationDrawing.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-spreadsheetDrawing.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-wordprocessingDrawing.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/wml.xsd new file: .claude/skills/pptx/scripts/office/schemas/ISO-IEC29500-4_2016/xml.xsd new file: .claude/skills/pptx/scripts/office/schemas/ecma/fouth-edition/opc-contentTypes.xsd new file: .claude/skills/pptx/scripts/office/schemas/ecma/fouth-edition/opc-coreProperties.xsd new file: .claude/skills/pptx/scripts/office/schemas/ecma/fouth-edition/opc-digSig.xsd new file: .claude/skills/pptx/scripts/office/schemas/ecma/fouth-edition/opc-relationships.xsd new file: .claude/skills/pptx/scripts/office/schemas/mce/mc.xsd new file: .claude/skills/pptx/scripts/office/schemas/microsoft/wml-2010.xsd new file: .claude/skills/pptx/scripts/office/schemas/microsoft/wml-2012.xsd new file: .claude/skills/pptx/scripts/office/schemas/microsoft/wml-2018.xsd new file: .claude/skills/pptx/scripts/office/schemas/microsoft/wml-cex-2018.xsd new file: .claude/skills/pptx/scripts/office/schemas/microsoft/wml-cid-2016.xsd new file: .claude/skills/pptx/scripts/office/schemas/microsoft/wml-sdtdatahash-2020.xsd new file: .claude/skills/pptx/scripts/office/schemas/microsoft/wml-symex-2015.xsd new file: .claude/skills/pptx/scripts/office/soffice.py new file: .claude/skills/pptx/scripts/office/unpack.py new file: .claude/skills/pptx/scripts/office/validate.py new file: .claude/skills/pptx/scripts/office/validators/__init__.py new file: .claude/skills/pptx/scripts/office/validators/base.py new file: .claude/skills/pptx/scripts/office/validators/docx.py new file: .claude/skills/pptx/scripts/office/validators/pptx.py new file: .claude/skills/pptx/scripts/office/validators/redlining.py new file: .claude/skills/pptx/scripts/thumbnail.py new file: .claude/skills/skill-creator/.openskills.json new file: .claude/skills/skill-creator/LICENSE.txt new file: .claude/skills/skill-creator/SKILL.md new file: .claude/skills/skill-creator/agents/analyzer.md new file: .claude/skills/skill-creator/agents/comparator.md new file: .claude/skills/skill-creator/agents/grader.md new file: .claude/skills/skill-creator/assets/eval_review.html new file: .claude/skills/skill-creator/eval-viewer/generate_review.py new file: .claude/skills/skill-creator/eval-viewer/viewer.html new file: .claude/skills/skill-creator/references/schemas.md new file: .claude/skills/skill-creator/scripts/__init__.py new file: .claude/skills/skill-creator/scripts/aggregate_benchmark.py new file: .claude/skills/skill-creator/scripts/generate_report.py new file: .claude/skills/skill-creator/scripts/improve_description.py new file: .claude/skills/skill-creator/scripts/package_skill.py new file: .claude/skills/skill-creator/scripts/quick_validate.py new file: .claude/skills/skill-creator/scripts/run_eval.py new file: .claude/skills/skill-creator/scripts/run_loop.py new file: .claude/skills/skill-creator/scripts/utils.py new file: .claude/skills/slack-gif-creator/.openskills.json new file: .claude/skills/slack-gif-creator/LICENSE.txt new file: .claude/skills/slack-gif-creator/SKILL.md new file: .claude/skills/slack-gif-creator/core/easing.py new file: .claude/skills/slack-gif-creator/core/frame_composer.py new file: .claude/skills/slack-gif-creator/core/gif_builder.py new file: .claude/skills/slack-gif-creator/core/validators.py new file: .claude/skills/slack-gif-creator/requirements.txt new file: .claude/skills/template/.openskills.json new file: .claude/skills/template/SKILL.md new file: .claude/skills/theme-factory/.openskills.json new file: .claude/skills/theme-factory/LICENSE.txt new file: .claude/skills/theme-factory/SKILL.md new file: .claude/skills/theme-factory/theme-showcase.pdf new file: .claude/skills/theme-factory/themes/arctic-frost.md new file: .claude/skills/theme-factory/themes/botanical-garden.md new file: .claude/skills/theme-factory/themes/desert-rose.md new file: .claude/skills/theme-factory/themes/forest-canopy.md new file: .claude/skills/theme-factory/themes/golden-hour.md new file: .claude/skills/theme-factory/themes/midnight-galaxy.md new file: .claude/skills/theme-factory/themes/modern-minimalist.md new file: .claude/skills/theme-factory/themes/ocean-depths.md new file: .claude/skills/theme-factory/themes/sunset-boulevard.md new file: .claude/skills/theme-factory/themes/tech-innovation.md new file: .claude/skills/web-artifacts-builder/.openskills.json new file: .claude/skills/web-artifacts-builder/LICENSE.txt new file: .claude/skills/web-artifacts-builder/SKILL.md new file: .claude/skills/web-artifacts-builder/scripts/bundle-artifact.sh new file: .claude/skills/web-artifacts-builder/scripts/init-artifact.sh new file: .claude/skills/web-artifacts-builder/scripts/shadcn-components.tar.gz new file: .claude/skills/webapp-testing/.openskills.json new file: .claude/skills/webapp-testing/LICENSE.txt new file: .claude/skills/webapp-testing/SKILL.md new file: .claude/skills/webapp-testing/examples/console_logging.py new file: .claude/skills/webapp-testing/examples/element_discovery.py new file: .claude/skills/webapp-testing/examples/static_html_automation.py new file: .claude/skills/webapp-testing/scripts/with_server.py new file: .claude/skills/xlsx/.openskills.json new file: .claude/skills/xlsx/LICENSE.txt new file: .claude/skills/xlsx/SKILL.md new file: .claude/skills/xlsx/scripts/office/helpers/__init__.py new file: .claude/skills/xlsx/scripts/office/helpers/merge_runs.py new file: .claude/skills/xlsx/scripts/office/helpers/simplify_redlines.py new file: .claude/skills/xlsx/scripts/office/pack.py new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-chart.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-chartDrawing.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-diagram.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-lockedCanvas.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-main.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-picture.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-spreadsheetDrawing.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/dml-wordprocessingDrawing.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/pml.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-additionalCharacteristics.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-bibliography.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-commonSimpleTypes.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-customXmlDataProperties.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-customXmlSchemaProperties.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesCustom.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesExtended.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesVariantTypes.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-math.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/shared-relationshipReference.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/sml.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-main.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-officeDrawing.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-presentationDrawing.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-spreadsheetDrawing.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/vml-wordprocessingDrawing.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/wml.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ISO-IEC29500-4_2016/xml.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ecma/fouth-edition/opc-contentTypes.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ecma/fouth-edition/opc-coreProperties.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ecma/fouth-edition/opc-digSig.xsd new file: .claude/skills/xlsx/scripts/office/schemas/ecma/fouth-edition/opc-relationships.xsd new file: .claude/skills/xlsx/scripts/office/schemas/mce/mc.xsd new file: .claude/skills/xlsx/scripts/office/schemas/microsoft/wml-2010.xsd new file: .claude/skills/xlsx/scripts/office/schemas/microsoft/wml-2012.xsd new file: .claude/skills/xlsx/scripts/office/schemas/microsoft/wml-2018.xsd new file: .claude/skills/xlsx/scripts/office/schemas/microsoft/wml-cex-2018.xsd new file: .claude/skills/xlsx/scripts/office/schemas/microsoft/wml-cid-2016.xsd new file: .claude/skills/xlsx/scripts/office/schemas/microsoft/wml-sdtdatahash-2020.xsd new file: .claude/skills/xlsx/scripts/office/schemas/microsoft/wml-symex-2015.xsd new file: .claude/skills/xlsx/scripts/office/soffice.py new file: .claude/skills/xlsx/scripts/office/unpack.py new file: .claude/skills/xlsx/scripts/office/validate.py new file: .claude/skills/xlsx/scripts/office/validators/__init__.py new file: .claude/skills/xlsx/scripts/office/validators/base.py new file: .claude/skills/xlsx/scripts/office/validators/docx.py new file: .claude/skills/xlsx/scripts/office/validators/pptx.py new file: .claude/skills/xlsx/scripts/office/validators/redlining.py new file: .claude/skills/xlsx/scripts/recalc.py new file: .env.example new file: .gitignore new file: config/mcp.json new file: config/models.json new file: config/personalities.json new file: docs/AGENTS.md new file: docs/AI_IMPLEMENTATION.md new file: docs/AI_INTEGRATION_COMPLETE.md new file: docs/AI_QUICKSTART.md new file: docs/AI_SUMMARY.md new file: docs/CHANGELOG.md new file: docs/CONFIG_GUIDE.md new file: docs/FIXES.md new file: docs/PROJECT_REFACTOR.md new file: docs/README.md new file: docs/README_INDEX.md new file: examples/ai_example.py new file: main.py new file: pytest.ini new file: requirements.txt new file: scripts/migrate_to_vector_db.py new file: skills/cmd_zip_skill/README.md new file: skills/cmd_zip_skill/__init__.py new file: skills/cmd_zip_skill/main.py new file: skills/cmd_zip_skill/skill.json new file: skills/cmd_zip_skill_1772465404375/README.md new file: skills/cmd_zip_skill_1772465404375/__init__.py new file: skills/cmd_zip_skill_1772465404375/main.py new file: skills/cmd_zip_skill_1772465404375/skill.json new file: skills/cmd_zip_skill_1772465434774/README.md new file: skills/cmd_zip_skill_1772465434774/__init__.py new file: skills/cmd_zip_skill_1772465434774/main.py new file: skills/cmd_zip_skill_1772465434774/skill.json new file: skills/cmd_zip_skill_1772465467809/README.md new file: skills/cmd_zip_skill_1772465467809/__init__.py new file: skills/cmd_zip_skill_1772465467809/main.py new file: skills/cmd_zip_skill_1772465467809/skill.json new file: skills/cmd_zip_skill_1772465652075/README.md new file: skills/cmd_zip_skill_1772465652075/__init__.py new file: skills/cmd_zip_skill_1772465652075/main.py new file: skills/cmd_zip_skill_1772465652075/skill.json new file: skills/cmd_zip_skill_1772465685352/README.md new file: skills/cmd_zip_skill_1772465685352/__init__.py new file: skills/cmd_zip_skill_1772465685352/main.py new file: skills/cmd_zip_skill_1772465685352/skill.json new file: skills/cmd_zip_skill_1772465936294/README.md new file: skills/cmd_zip_skill_1772465936294/__init__.py new file: skills/cmd_zip_skill_1772465936294/main.py new file: skills/cmd_zip_skill_1772465936294/skill.json new file: skills/cmd_zip_skill_1772465966322/README.md new file: skills/cmd_zip_skill_1772465966322/__init__.py new file: skills/cmd_zip_skill_1772465966322/main.py new file: skills/cmd_zip_skill_1772465966322/skill.json new file: skills/cmd_zip_skill_1772466071278/README.md new file: skills/cmd_zip_skill_1772466071278/__init__.py new file: skills/cmd_zip_skill_1772466071278/main.py new file: skills/cmd_zip_skill_1772466071278/skill.json new file: skills/skills_creator/README.md new file: skills/skills_creator/__init__.py new file: skills/skills_creator/main.py new file: skills/skills_creator/skill.json new file: src/__init__.py new file: src/ai/__init__.py new file: src/ai/base.py new file: src/ai/client.py new file: src/ai/docs/README.md new file: src/ai/mcp/__init__.py new file: src/ai/mcp/base.py new file: src/ai/mcp/servers/__init__.py new file: src/ai/mcp/servers/filesystem.py new file: src/ai/memory.py new file: src/ai/models/__init__.py new file: src/ai/models/anthropic_model.py new file: src/ai/models/openai_model.py new file: src/ai/personality.py new file: src/ai/skills/__init__.py new file: src/ai/skills/base.py new file: src/ai/task_manager.py new file: src/ai/vector_store/__init__.py new file: src/ai/vector_store/base.py new file: src/ai/vector_store/chroma_store.py new file: src/ai/vector_store/json_store.py new file: src/core/__init__.py new file: src/core/bot.py new file: src/core/config.py new file: src/handlers/__init__.py new file: src/handlers/message_handler.py new file: src/handlers/message_handler_ai.py new file: src/utils/__init__.py new file: src/utils/logger.py new file: start.bat new file: tests/test_ai.py
16 KiB
16 KiB
PDF Processing Advanced Reference
This document contains advanced PDF processing features, detailed examples, and additional libraries not covered in the main skill instructions.
pypdfium2 Library (Apache/BSD License)
Overview
pypdfium2 is a Python binding for PDFium (Chromium's PDF library). It's excellent for fast PDF rendering, image generation, and serves as a PyMuPDF replacement.
Render PDF to Images
import pypdfium2 as pdfium
from PIL import Image
# Load PDF
pdf = pdfium.PdfDocument("document.pdf")
# Render page to image
page = pdf[0] # First page
bitmap = page.render(
scale=2.0, # Higher resolution
rotation=0 # No rotation
)
# Convert to PIL Image
img = bitmap.to_pil()
img.save("page_1.png", "PNG")
# Process multiple pages
for i, page in enumerate(pdf):
bitmap = page.render(scale=1.5)
img = bitmap.to_pil()
img.save(f"page_{i+1}.jpg", "JPEG", quality=90)
Extract Text with pypdfium2
import pypdfium2 as pdfium
pdf = pdfium.PdfDocument("document.pdf")
for i, page in enumerate(pdf):
text = page.get_text()
print(f"Page {i+1} text length: {len(text)} chars")
JavaScript Libraries
pdf-lib (MIT License)
pdf-lib is a powerful JavaScript library for creating and modifying PDF documents in any JavaScript environment.
Load and Manipulate Existing PDF
import { PDFDocument } from 'pdf-lib';
import fs from 'fs';
async function manipulatePDF() {
// Load existing PDF
const existingPdfBytes = fs.readFileSync('input.pdf');
const pdfDoc = await PDFDocument.load(existingPdfBytes);
// Get page count
const pageCount = pdfDoc.getPageCount();
console.log(`Document has ${pageCount} pages`);
// Add new page
const newPage = pdfDoc.addPage([600, 400]);
newPage.drawText('Added by pdf-lib', {
x: 100,
y: 300,
size: 16
});
// Save modified PDF
const pdfBytes = await pdfDoc.save();
fs.writeFileSync('modified.pdf', pdfBytes);
}
Create Complex PDFs from Scratch
import { PDFDocument, rgb, StandardFonts } from 'pdf-lib';
import fs from 'fs';
async function createPDF() {
const pdfDoc = await PDFDocument.create();
// Add fonts
const helveticaFont = await pdfDoc.embedFont(StandardFonts.Helvetica);
const helveticaBold = await pdfDoc.embedFont(StandardFonts.HelveticaBold);
// Add page
const page = pdfDoc.addPage([595, 842]); // A4 size
const { width, height } = page.getSize();
// Add text with styling
page.drawText('Invoice #12345', {
x: 50,
y: height - 50,
size: 18,
font: helveticaBold,
color: rgb(0.2, 0.2, 0.8)
});
// Add rectangle (header background)
page.drawRectangle({
x: 40,
y: height - 100,
width: width - 80,
height: 30,
color: rgb(0.9, 0.9, 0.9)
});
// Add table-like content
const items = [
['Item', 'Qty', 'Price', 'Total'],
['Widget', '2', '$50', '$100'],
['Gadget', '1', '$75', '$75']
];
let yPos = height - 150;
items.forEach(row => {
let xPos = 50;
row.forEach(cell => {
page.drawText(cell, {
x: xPos,
y: yPos,
size: 12,
font: helveticaFont
});
xPos += 120;
});
yPos -= 25;
});
const pdfBytes = await pdfDoc.save();
fs.writeFileSync('created.pdf', pdfBytes);
}
Advanced Merge and Split Operations
import { PDFDocument } from 'pdf-lib';
import fs from 'fs';
async function mergePDFs() {
// Create new document
const mergedPdf = await PDFDocument.create();
// Load source PDFs
const pdf1Bytes = fs.readFileSync('doc1.pdf');
const pdf2Bytes = fs.readFileSync('doc2.pdf');
const pdf1 = await PDFDocument.load(pdf1Bytes);
const pdf2 = await PDFDocument.load(pdf2Bytes);
// Copy pages from first PDF
const pdf1Pages = await mergedPdf.copyPages(pdf1, pdf1.getPageIndices());
pdf1Pages.forEach(page => mergedPdf.addPage(page));
// Copy specific pages from second PDF (pages 0, 2, 4)
const pdf2Pages = await mergedPdf.copyPages(pdf2, [0, 2, 4]);
pdf2Pages.forEach(page => mergedPdf.addPage(page));
const mergedPdfBytes = await mergedPdf.save();
fs.writeFileSync('merged.pdf', mergedPdfBytes);
}
pdfjs-dist (Apache License)
PDF.js is Mozilla's JavaScript library for rendering PDFs in the browser.
Basic PDF Loading and Rendering
import * as pdfjsLib from 'pdfjs-dist';
// Configure worker (important for performance)
pdfjsLib.GlobalWorkerOptions.workerSrc = './pdf.worker.js';
async function renderPDF() {
// Load PDF
const loadingTask = pdfjsLib.getDocument('document.pdf');
const pdf = await loadingTask.promise;
console.log(`Loaded PDF with ${pdf.numPages} pages`);
// Get first page
const page = await pdf.getPage(1);
const viewport = page.getViewport({ scale: 1.5 });
// Render to canvas
const canvas = document.createElement('canvas');
const context = canvas.getContext('2d');
canvas.height = viewport.height;
canvas.width = viewport.width;
const renderContext = {
canvasContext: context,
viewport: viewport
};
await page.render(renderContext).promise;
document.body.appendChild(canvas);
}
Extract Text with Coordinates
import * as pdfjsLib from 'pdfjs-dist';
async function extractText() {
const loadingTask = pdfjsLib.getDocument('document.pdf');
const pdf = await loadingTask.promise;
let fullText = '';
// Extract text from all pages
for (let i = 1; i <= pdf.numPages; i++) {
const page = await pdf.getPage(i);
const textContent = await page.getTextContent();
const pageText = textContent.items
.map(item => item.str)
.join(' ');
fullText += `\n--- Page ${i} ---\n${pageText}`;
// Get text with coordinates for advanced processing
const textWithCoords = textContent.items.map(item => ({
text: item.str,
x: item.transform[4],
y: item.transform[5],
width: item.width,
height: item.height
}));
}
console.log(fullText);
return fullText;
}
Extract Annotations and Forms
import * as pdfjsLib from 'pdfjs-dist';
async function extractAnnotations() {
const loadingTask = pdfjsLib.getDocument('annotated.pdf');
const pdf = await loadingTask.promise;
for (let i = 1; i <= pdf.numPages; i++) {
const page = await pdf.getPage(i);
const annotations = await page.getAnnotations();
annotations.forEach(annotation => {
console.log(`Annotation type: ${annotation.subtype}`);
console.log(`Content: ${annotation.contents}`);
console.log(`Coordinates: ${JSON.stringify(annotation.rect)}`);
});
}
}
Advanced Command-Line Operations
poppler-utils Advanced Features
Extract Text with Bounding Box Coordinates
# Extract text with bounding box coordinates (essential for structured data)
pdftotext -bbox-layout document.pdf output.xml
# The XML output contains precise coordinates for each text element
Advanced Image Conversion
# Convert to PNG images with specific resolution
pdftoppm -png -r 300 document.pdf output_prefix
# Convert specific page range with high resolution
pdftoppm -png -r 600 -f 1 -l 3 document.pdf high_res_pages
# Convert to JPEG with quality setting
pdftoppm -jpeg -jpegopt quality=85 -r 200 document.pdf jpeg_output
Extract Embedded Images
# Extract all embedded images with metadata
pdfimages -j -p document.pdf page_images
# List image info without extracting
pdfimages -list document.pdf
# Extract images in their original format
pdfimages -all document.pdf images/img
qpdf Advanced Features
Complex Page Manipulation
# Split PDF into groups of pages
qpdf --split-pages=3 input.pdf output_group_%02d.pdf
# Extract specific pages with complex ranges
qpdf input.pdf --pages input.pdf 1,3-5,8,10-end -- extracted.pdf
# Merge specific pages from multiple PDFs
qpdf --empty --pages doc1.pdf 1-3 doc2.pdf 5-7 doc3.pdf 2,4 -- combined.pdf
PDF Optimization and Repair
# Optimize PDF for web (linearize for streaming)
qpdf --linearize input.pdf optimized.pdf
# Remove unused objects and compress
qpdf --optimize-level=all input.pdf compressed.pdf
# Attempt to repair corrupted PDF structure
qpdf --check input.pdf
qpdf --fix-qdf damaged.pdf repaired.pdf
# Show detailed PDF structure for debugging
qpdf --show-all-pages input.pdf > structure.txt
Advanced Encryption
# Add password protection with specific permissions
qpdf --encrypt user_pass owner_pass 256 --print=none --modify=none -- input.pdf encrypted.pdf
# Check encryption status
qpdf --show-encryption encrypted.pdf
# Remove password protection (requires password)
qpdf --password=secret123 --decrypt encrypted.pdf decrypted.pdf
Advanced Python Techniques
pdfplumber Advanced Features
Extract Text with Precise Coordinates
import pdfplumber
with pdfplumber.open("document.pdf") as pdf:
page = pdf.pages[0]
# Extract all text with coordinates
chars = page.chars
for char in chars[:10]: # First 10 characters
print(f"Char: '{char['text']}' at x:{char['x0']:.1f} y:{char['y0']:.1f}")
# Extract text by bounding box (left, top, right, bottom)
bbox_text = page.within_bbox((100, 100, 400, 200)).extract_text()
Advanced Table Extraction with Custom Settings
import pdfplumber
import pandas as pd
with pdfplumber.open("complex_table.pdf") as pdf:
page = pdf.pages[0]
# Extract tables with custom settings for complex layouts
table_settings = {
"vertical_strategy": "lines",
"horizontal_strategy": "lines",
"snap_tolerance": 3,
"intersection_tolerance": 15
}
tables = page.extract_tables(table_settings)
# Visual debugging for table extraction
img = page.to_image(resolution=150)
img.save("debug_layout.png")
reportlab Advanced Features
Create Professional Reports with Tables
from reportlab.platypus import SimpleDocTemplate, Table, TableStyle, Paragraph
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.lib import colors
# Sample data
data = [
['Product', 'Q1', 'Q2', 'Q3', 'Q4'],
['Widgets', '120', '135', '142', '158'],
['Gadgets', '85', '92', '98', '105']
]
# Create PDF with table
doc = SimpleDocTemplate("report.pdf")
elements = []
# Add title
styles = getSampleStyleSheet()
title = Paragraph("Quarterly Sales Report", styles['Title'])
elements.append(title)
# Add table with advanced styling
table = Table(data)
table.setStyle(TableStyle([
('BACKGROUND', (0, 0), (-1, 0), colors.grey),
('TEXTCOLOR', (0, 0), (-1, 0), colors.whitesmoke),
('ALIGN', (0, 0), (-1, -1), 'CENTER'),
('FONTNAME', (0, 0), (-1, 0), 'Helvetica-Bold'),
('FONTSIZE', (0, 0), (-1, 0), 14),
('BOTTOMPADDING', (0, 0), (-1, 0), 12),
('BACKGROUND', (0, 1), (-1, -1), colors.beige),
('GRID', (0, 0), (-1, -1), 1, colors.black)
]))
elements.append(table)
doc.build(elements)
Complex Workflows
Extract Figures/Images from PDF
Method 1: Using pdfimages (fastest)
# Extract all images with original quality
pdfimages -all document.pdf images/img
Method 2: Using pypdfium2 + Image Processing
import pypdfium2 as pdfium
from PIL import Image
import numpy as np
def extract_figures(pdf_path, output_dir):
pdf = pdfium.PdfDocument(pdf_path)
for page_num, page in enumerate(pdf):
# Render high-resolution page
bitmap = page.render(scale=3.0)
img = bitmap.to_pil()
# Convert to numpy for processing
img_array = np.array(img)
# Simple figure detection (non-white regions)
mask = np.any(img_array != [255, 255, 255], axis=2)
# Find contours and extract bounding boxes
# (This is simplified - real implementation would need more sophisticated detection)
# Save detected figures
# ... implementation depends on specific needs
Batch PDF Processing with Error Handling
import os
import glob
from pypdf import PdfReader, PdfWriter
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
def batch_process_pdfs(input_dir, operation='merge'):
pdf_files = glob.glob(os.path.join(input_dir, "*.pdf"))
if operation == 'merge':
writer = PdfWriter()
for pdf_file in pdf_files:
try:
reader = PdfReader(pdf_file)
for page in reader.pages:
writer.add_page(page)
logger.info(f"Processed: {pdf_file}")
except Exception as e:
logger.error(f"Failed to process {pdf_file}: {e}")
continue
with open("batch_merged.pdf", "wb") as output:
writer.write(output)
elif operation == 'extract_text':
for pdf_file in pdf_files:
try:
reader = PdfReader(pdf_file)
text = ""
for page in reader.pages:
text += page.extract_text()
output_file = pdf_file.replace('.pdf', '.txt')
with open(output_file, 'w', encoding='utf-8') as f:
f.write(text)
logger.info(f"Extracted text from: {pdf_file}")
except Exception as e:
logger.error(f"Failed to extract text from {pdf_file}: {e}")
continue
Advanced PDF Cropping
from pypdf import PdfWriter, PdfReader
reader = PdfReader("input.pdf")
writer = PdfWriter()
# Crop page (left, bottom, right, top in points)
page = reader.pages[0]
page.mediabox.left = 50
page.mediabox.bottom = 50
page.mediabox.right = 550
page.mediabox.top = 750
writer.add_page(page)
with open("cropped.pdf", "wb") as output:
writer.write(output)
Performance Optimization Tips
1. For Large PDFs
- Use streaming approaches instead of loading entire PDF in memory
- Use
qpdf --split-pagesfor splitting large files - Process pages individually with pypdfium2
2. For Text Extraction
pdftotext -bbox-layoutis fastest for plain text extraction- Use pdfplumber for structured data and tables
- Avoid
pypdf.extract_text()for very large documents
3. For Image Extraction
pdfimagesis much faster than rendering pages- Use low resolution for previews, high resolution for final output
4. For Form Filling
- pdf-lib maintains form structure better than most alternatives
- Pre-validate form fields before processing
5. Memory Management
# Process PDFs in chunks
def process_large_pdf(pdf_path, chunk_size=10):
reader = PdfReader(pdf_path)
total_pages = len(reader.pages)
for start_idx in range(0, total_pages, chunk_size):
end_idx = min(start_idx + chunk_size, total_pages)
writer = PdfWriter()
for i in range(start_idx, end_idx):
writer.add_page(reader.pages[i])
# Process chunk
with open(f"chunk_{start_idx//chunk_size}.pdf", "wb") as output:
writer.write(output)
Troubleshooting Common Issues
Encrypted PDFs
# Handle password-protected PDFs
from pypdf import PdfReader
try:
reader = PdfReader("encrypted.pdf")
if reader.is_encrypted:
reader.decrypt("password")
except Exception as e:
print(f"Failed to decrypt: {e}")
Corrupted PDFs
# Use qpdf to repair
qpdf --check corrupted.pdf
qpdf --replace-input corrupted.pdf
Text Extraction Issues
# Fallback to OCR for scanned PDFs
import pytesseract
from pdf2image import convert_from_path
def extract_text_with_ocr(pdf_path):
images = convert_from_path(pdf_path)
text = ""
for i, image in enumerate(images):
text += pytesseract.image_to_string(image)
return text
License Information
- pypdf: BSD License
- pdfplumber: MIT License
- pypdfium2: Apache/BSD License
- reportlab: BSD License
- poppler-utils: GPL-2 License
- qpdf: Apache License
- pdf-lib: MIT License
- pdfjs-dist: Apache License