Extending libdxfrw: Handling Custom Entities and Attributes
libdxfrw is a compact C++ library for reading and writing DXF files. It focuses on core functionality and a lightweight, extensible design, which makes it a solid base when you need to work with DXF data programmatically. This article explains how to extend libdxfrw to handle custom entities and attributes, covering design considerations, concrete implementation steps, examples, and testing strategies.
Why extend libdxfrw?
DXF files exported by CAD systems often include custom entities or nonstandard attributes (typically stored as XDATA, extension dictionaries, or proprietary entity types). To preserve or manipulate these items you must either:
- Parse and retain the raw data so it isn’t lost when re-saving the file, or
- Implement typed support so your application can interpret, validate, and modify the custom content.
libdxfrw’s minimalist design makes both approaches possible: you can either store unknown entities as generic containers or add new typed classes and handlers integrated with the library’s read/write flow.
Overview of DXF mechanisms for custom data
Before implementation, understand where custom data appears in DXF:
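- XDATA (Extended Entity Data): application-specific data attached to entities. Each block starts with a 1001 group carrying the application name, followed by groups in the 1000–1071 range; 1002 groups carry the { and } control strings that delimit nested lists.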
- Dictionaries (AcDbDictionary): stored in the objects section, can hold named entries and reference arbitrary objects; custom applications often use dictionaries to store metadata.
- Custom entity types: some exporters use custom entity names (e.g., “ACAD_PROXY_ENTITY” or vendor-specific type names) or proxy entities to represent unknown objects.
- APPID table entries: application names registered in the TABLES section. Every application name used in a 1001 group should have a matching APPID entry.
- Group codes and custom group-code-driven attributes: some apps put nonstandard meanings into standard group codes or use reserved groups.
Understanding which mechanism your target CAD application uses is essential to designing an extension.
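All of these mechanisms sit on the same wire format: an ASCII DXF file is a sequence of two-line groups, a numeric code line followed by a value line. The sketch below is a minimal stand-alone tokenizer for that format. It does not use libdxfrw, and the names `Group`, `readGroups`, `readGroupsFromString`, and `isXDataCode` are hypothetical helpers introduced only for illustration.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// One DXF group: numeric code plus its value, kept as raw text.
using Group = std::pair<int, std::string>;

// Drop a trailing '\r' so files with Windows line endings parse the same way.
void stripCarriageReturn(std::string &s) {
    if (!s.empty() && s.back() == '\r') s.pop_back();
}

// Read (code, value) pairs until the stream ends or a code line is not numeric.
std::vector<Group> readGroups(std::istream &in) {
    std::vector<Group> groups;
    std::string codeLine, valueLine;
    while (std::getline(in, codeLine) && std::getline(in, valueLine)) {
        stripCarriageReturn(codeLine);
        stripCarriageReturn(valueLine);
        try {
            // std::stoi skips the leading padding that DXF writers often emit.
            groups.emplace_back(std::stoi(codeLine), valueLine);
        } catch (const std::exception &) {
            break; // malformed code line: stop rather than guess
        }
    }
    return groups;
}

// Convenience wrapper for in-memory text.
std::vector<Group> readGroupsFromString(const std::string &text) {
    std::istringstream in(text);
    return readGroups(in);
}

// XDATA occupies group codes 1000-1071; 1001 opens a block with the application name.
bool isXDataCode(int code) { return code >= 1000 && code <= 1071; }
```

Dumping a vendor sample through a tokenizer like this and looking for 1001 groups, unfamiliar group 0 names, or dictionary objects is a quick way to see which mechanism the exporter relies on.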
Design approaches
Two main approaches to extend libdxfrw:
- Preserve-as-unknown (safe, quick)
  - When encountering an unknown entity or group of XDATA, capture and store its raw group-code stream and any associated metadata. When writing, emit the raw data back unchanged.
  - Pros: minimal code, preserves data losslessly, fast to implement.
  - Cons: no typed access, harder to manipulate internals.
- Typed extension (richer, more work)
  - Implement concrete C++ classes representing the custom entities/attributes, integrate them into the parsing pipeline, and map serialized group codes into structured fields. Provide read/write methods, validation, and helper accessors.
  - Pros: structured access, easier to modify and validate data, cleaner integration with app logic.
  - Cons: more code, must handle compatibility and versioning.
Often a hybrid approach works best: preserve unknown content by default, but add typed classes for the specific custom items you need to work with.
libdxfrw internals (relevant parts)
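Key pieces of libdxfrw to hook into (the lowercase drw_* names used in the sketches later in this article are illustrative placeholders rather than library identifiers):
- dxfRW and DRW_Interface: dxfRW drives reading and writing, and while reading it invokes the DRW_Interface callbacks once per parsed entity, table entry, or object. Implementations of DRW_Interface receive the parsed objects and are responsible for application behavior.
- DRW_Entity and derived classes: entity classes like DRW_Line, DRW_Circle, etc. New typed entities should derive from DRW_Entity following the library’s conventions.
- DRW_Text, DRW_Block, and the table and object classes declared in drw_objects.h: structures representing section and object data.
- Low-level reader and writer classes: dxfReader and dxfWriter tokenize and emit group codes. The dwg-prefixed classes handle reading DWG files and play no part in serializing entities back to DXF.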
You’ll modify or extend the parser to recognize your custom entity names or to capture XDATA/dictionary entries and route them into your new classes or containers.
Step-by-step: Preserving unknown entities and XDATA
- Identify where unknown entities are currently handled.
  - The library’s reader typically creates objects for known entity types and may drop unknown types or turn them into proxy objects. Inspect the reader and the code that handles group codes for unknown types.
- Create a generic container class for unknown content.
  - Example structure:

        class drw_unknown_entity : public drw_entity {
        public:
            std::string raw_type; // entity type name as found in DXF
            std::vector<std::pair<int, std::string>> groups; // group-code/value pairs
        };

  - Store both the numeric group code and the raw string value so the original stream can be reconstructed.
- Capture XDATA attachments.
  - An XDATA block starts with group 1001 (the application name), followed by groups in the 1000–1071 range, with 1002 { … } control strings as delimiters. When the reader sees 1001, collect the subsequent group codes until the XDATA ends and attach them to the entity container (or to a dedicated XData container class).
- Preserve dictionaries and objects.
  - For AcDbDictionary entries and OBJECTS section entries, implement a similar container to hold their raw group streams and references. Make sure to capture handles and owner references to preserve graph structure.
- Write back unchanged.
  - Implement drw_unknown_entity::write(drw_writer& out) to iterate the stored groups and emit the exact group codes and values. For XDATA, preserve the 1001 application-name group and any 1002 brace delimiters exactly as read.
- Tests
  - Round-trip test: read a DXF with custom entities, write it back, and compare the original and output (ignoring ordering differences that DXF allows). A byte-for-byte match may not always be possible due to formatting differences; instead compare parsed group sequences or use a DXF-aware differ.
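The capture and write-back steps above can be sketched without the library. This is a minimal stand-alone illustration under stated assumptions: `RawEntity`, `XDataBlock`, `captureEntity`, `writeEntity`, and `toDxf` are hypothetical names, not libdxfrw classes, and the input is assumed to be the list of groups that followed an entity's group 0.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using Group = std::pair<int, std::string>;

// Hypothetical containers; not part of libdxfrw.
struct XDataBlock {
    std::string appName;        // value of the opening 1001 group
    std::vector<Group> groups;  // groups up to the next 1001 or the entity end
};

struct RawEntity {
    std::string type;           // value of the group 0 that started the entity
    std::vector<Group> body;    // non-XDATA groups in original order
    std::vector<XDataBlock> xdata;
};

// Split the groups that follow an entity's group 0 into body and XDATA blocks.
RawEntity captureEntity(const std::string &type, const std::vector<Group> &groups) {
    RawEntity e;
    e.type = type;
    for (const Group &g : groups) {
        if (g.first == 1001) {
            e.xdata.push_back(XDataBlock{g.second, {}});
        } else if (g.first >= 1000 && !e.xdata.empty()) {
            e.xdata.back().groups.push_back(g);
        } else {
            e.body.push_back(g);
        }
    }
    return e;
}

void writeGroup(std::ostream &out, int code, const std::string &value) {
    out << code << '\n' << value << '\n';
}

// Emit the stored groups verbatim, keeping each 1001 header with its block.
void writeEntity(std::ostream &out, const RawEntity &e) {
    writeGroup(out, 0, e.type);
    for (const Group &g : e.body) writeGroup(out, g.first, g.second);
    for (const XDataBlock &x : e.xdata) {
        writeGroup(out, 1001, x.appName);
        for (const Group &g : x.groups) writeGroup(out, g.first, g.second);
    }
}

// Convenience wrapper that serializes to a string.
std::string toDxf(const RawEntity &e) {
    std::ostringstream out;
    writeEntity(out, e);
    return out.str();
}
```

Values are kept as text on purpose: re-parsing and re-formatting numbers is where most round-trip differences come from. The sketch does not reproduce the original padding of code lines, so output matches the input group for group rather than byte for byte.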
Step-by-step: Implementing typed custom entities
- Define the C++ class
  - Inherit from the base entity class and add fields for each attribute your custom entity requires.
  - Provide clear constructors and default values.
  - Example:

        class drw_custom_box : public drw_entity {
        public:
            drw_point minCorner;
            drw_point maxCorner;
            int customFlag;
            std::string label;

            drw_custom_box() : customFlag(0) {}

            void parseGroup(int code, const std::string &value);
            void write(drw_writer &out) const;
        };

- Parsing logic
  - Add code in the main reader dispatch (or register a handler) to instantiate drw_custom_box when the entity type string matches (e.g., “MYBOX”).
  - In parseGroup, switch on group codes to populate fields. For unknown groups, either store them in a fallback container (for lossless writes) or log/warn.
- Serialization
  - Implement write() to emit the DXF entity header (entity type, handle, layer, color, etc.) and your entity’s group codes in the correct order and format. Include any XDATA/dictionary entries if present.
- Registration and factory
  - If libdxfrw uses a factory or string->constructor mapping for entities, register your new entity class so it’s created during parsing.
  - Alternatively, extend the reader’s entity-dispatch to check for your custom name before falling back to unknown handlers.
- Integration with XDATA/dictionary if needed
  - If your custom entity uses XDATA for additional fields, implement reading of XDATA into typed fields. Validate the application name (1001) and parse subsequent codes accordingly.
- Maintain backwards compatibility
  - Files produced by other apps may include partial or differently-ordered group codes. Implement tolerant parsing: accept both optional and required codes, use defaults, and log parsing warnings rather than failing.
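The registration step can be illustrated with a small name-to-constructor map that falls back to a raw container, which is also one way to get the hybrid approach described earlier. Whether libdxfrw exposes such a hook is not assumed here; `Entity`, `UnknownEntity`, `LabelEntity`, and `EntityRegistry` are hypothetical stand-ins written for this sketch.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class standing in for the library's entity base.
struct Entity {
    virtual ~Entity() = default;
    virtual void parseGroup(int code, const std::string &value) = 0;
};

// Fallback: keeps every group verbatim so nothing is lost on re-save.
struct UnknownEntity : Entity {
    std::string type;
    std::vector<std::pair<int, std::string>> groups;
    explicit UnknownEntity(std::string t) : type(std::move(t)) {}
    void parseGroup(int code, const std::string &value) override {
        groups.emplace_back(code, value);
    }
};

// Tiny typed example: an entity that only understands group 1 (a label).
struct LabelEntity : Entity {
    std::string label;
    void parseGroup(int code, const std::string &value) override {
        if (code == 1) label = value;
    }
};

class EntityRegistry {
public:
    using Factory = std::function<std::unique_ptr<Entity>()>;

    void add(const std::string &typeName, Factory f) {
        factories_[typeName] = std::move(f);
    }

    // Typed object if the name is registered, raw container otherwise.
    std::unique_ptr<Entity> create(const std::string &typeName) const {
        auto it = factories_.find(typeName);
        if (it != factories_.end()) return it->second();
        return std::unique_ptr<Entity>(new UnknownEntity(typeName));
    }

private:
    std::map<std::string, Factory> factories_;
};
```

A reader loop would call `create()` with each group 0 value and then feed the following groups to `parseGroup()`. Because unregistered names land in `UnknownEntity`, typed support can be added one entity at a time without risking data loss for the rest.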
Example: Handling a hypothetical MYBOX entity
Suppose a vendor writes rectangles as a custom entity named “MYBOX” with group codes:
- 10,20: min corner X,Y
- 11,21: max corner X,Y
- 70: flags (integer)
- 1: label (string)
- XDATA under application “MYAPP” with extra metadata (1000/1040 codes)
Implementation sketch:
- drw_custom_box fields: minCorner{10,20}, maxCorner{11,21}, int flags, std::string label, XDataContainer xdata.
- Reader: on encountering entity name “MYBOX”, instantiate drw_custom_box and call parseGroup for each following group code until entity end.
- parseGroup handles codes 10/20/11/21/70/1; when encountering 1001==“MYAPP”, parse XDATA pairs (1000 strings, 1040 doubles).
- write(): emit entity header and group codes 10/20/11/21/70/1, then emit XDATA block with application name and its group codes.
Code examples in libdxfrw style usually require following the library’s existing parse/write signatures; adapt the above to fit.
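As a stand-alone illustration of the sketch above (not tied to libdxfrw's real parse/write signatures), the following `CustomBox` struct parses and writes the hypothetical MYBOX layout. `Point2`, `CustomBox`, and their members are names invented for this example. Groups it does not recognize, such as the layer in group 8, are kept verbatim and written right after the entity name, which is a simplification of real header ordering.

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct Point2 { double x = 0.0, y = 0.0; };

// Hypothetical typed entity for the MYBOX layout described above.
struct CustomBox {
    Point2 minCorner, maxCorner;
    int flags = 0;
    std::string label;
    bool hasXData = false;
    std::string xdataApp;                            // value of group 1001
    std::vector<std::pair<int, std::string>> xdata;  // groups after 1001, verbatim
    std::vector<std::pair<int, std::string>> extra;  // unrecognized groups, verbatim

    void parseGroup(int code, const std::string &value) {
        if (code == 1001) { hasXData = true; xdataApp = value; return; }
        if (hasXData) { xdata.emplace_back(code, value); return; }
        try {
            switch (code) {
            case 10: minCorner.x = std::stod(value); break;
            case 20: minCorner.y = std::stod(value); break;
            case 11: maxCorner.x = std::stod(value); break;
            case 21: maxCorner.y = std::stod(value); break;
            case 70: flags = std::stoi(value); break;
            case 1:  label = value; break;
            default: extra.emplace_back(code, value);
            }
        } catch (const std::exception &) {
            extra.emplace_back(code, value); // keep unparseable text instead of failing
        }
    }

    void write(std::ostream &out) const {
        const std::streamsize oldPrecision = out.precision(16);
        auto put = [&out](int code, const auto &v) { out << code << '\n' << v << '\n'; };
        put(0, "MYBOX");
        for (const auto &g : extra) put(g.first, g.second);
        put(10, minCorner.x);
        put(20, minCorner.y);
        put(11, maxCorner.x);
        put(21, maxCorner.y);
        put(70, flags);
        put(1, label);
        if (hasXData) {
            put(1001, xdataApp);
            for (const auto &g : xdata) put(g.first, g.second);
        }
        out.precision(oldPrecision);
    }

    std::string toDxf() const {
        std::ostringstream out;
        write(out);
        return out.str();
    }
};
```

The XDATA groups are stored as text here; a fuller version would first check that `xdataApp` equals "MYAPP" and then convert the 1000 and 1040 values into typed fields.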
Handling proxies and vendor-specific proxy entities
Some DXF exporters use proxy entities (e.g., ACAD_PROXY_ENTITY) to store unknown/custom objects. They often include an embedded binary or encoded stream describing the original object. To handle proxy entities:
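- Capture the proxy’s data into a binary buffer or encoded string field. In ACAD_PROXY_ENTITY the payload is carried as hex-encoded binary chunks in group 310; vendor-specific proxies may use extended data or proprietary group codes instead.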
- If vendor format is known and documented, implement a decoder that extracts meaningful fields and exposes typed accessors.
- If not known, preserve raw blob data and re-emit it unmodified when saving.
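Binary chunk groups such as 310 hold their payload as hexadecimal text, so capturing a proxy blob mostly means hex decoding. The helpers below are a minimal sketch with hypothetical names (`hexValue`, `appendHexChunk`, `toHex`); they make no assumption about what the decoded bytes mean.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Convert one hex digit to its value, or -1 if the character is not a hex digit.
int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decode one chunk and append it to `out`.
// Returns false and leaves `out` untouched if the chunk is malformed.
bool appendHexChunk(const std::string &chunk, std::vector<std::uint8_t> &out) {
    if (chunk.size() % 2 != 0) return false;
    std::vector<std::uint8_t> decoded;
    decoded.reserve(chunk.size() / 2);
    for (std::size_t i = 0; i < chunk.size(); i += 2) {
        const int hi = hexValue(chunk[i]);
        const int lo = hexValue(chunk[i + 1]);
        if (hi < 0 || lo < 0) return false;
        decoded.push_back(static_cast<std::uint8_t>(hi * 16 + lo));
    }
    out.insert(out.end(), decoded.begin(), decoded.end());
    return true;
}

// Re-encode bytes as uppercase hex text.
std::string toHex(const std::vector<std::uint8_t> &bytes) {
    static const char digits[] = "0123456789ABCDEF";
    std::string s;
    s.reserve(bytes.size() * 2);
    for (std::uint8_t b : bytes) {
        s.push_back(digits[b >> 4]);
        s.push_back(digits[b & 0x0F]);
    }
    return s;
}
```

Re-encoding loses the original chunk boundaries and letter case, so for pure preservation it is safer to keep the original chunk strings alongside the decoded buffer and write those back unless the blob was actually modified.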
Working with dictionaries and object references
Custom attributes are often stored in dictionaries where entries reference objects by handle. To preserve and manipulate:
- Parse OBJECTS section and build a handle->object map.
- When reading dictionaries, capture key/value pairs, noting whether values are handles to objects or literal values.
- If you add or modify dictionary entries, ensure unique handles are generated and references are updated.
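- When writing, assign handles to every object before serializing anything, so that dictionary entries can reference objects regardless of output order. Keep the root named-object dictionary first in the OBJECTS section, where DXF consumers expect to find it.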
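A handle map with a dangling-reference check covers most of the points above. The sketch below is stand-alone and uses hypothetical names (`Handle`, `RawObject`, `Dictionary`, `ObjectTable`). It assumes dictionary entries were already read as name/handle pairs (group 3 for the entry name, group 350 or 360 for the handle) and that handles are the hexadecimal strings DXF uses.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using Handle = std::uint64_t;

// DXF handles are hexadecimal strings.
Handle parseHandle(const std::string &hex) { return std::stoull(hex, nullptr, 16); }

std::string formatHandle(Handle h) {
    std::ostringstream os;
    os << std::hex << std::uppercase << h;
    return os.str();
}

struct RawObject {
    std::string type;
    std::vector<std::pair<int, std::string>> groups;
};

struct Dictionary {
    std::map<std::string, Handle> entries; // entry name -> referenced handle
};

class ObjectTable {
public:
    void add(Handle h, RawObject obj) { objects_[h] = std::move(obj); }

    const RawObject *find(Handle h) const {
        auto it = objects_.find(h);
        return it == objects_.end() ? nullptr : &it->second;
    }

    // Next unused handle: one past the largest handle registered so far.
    Handle allocate() const {
        return objects_.empty() ? 1 : objects_.rbegin()->first + 1;
    }

    // Names of dictionary entries whose handle does not resolve to an object.
    std::vector<std::string> danglingEntries(const Dictionary &d) const {
        std::vector<std::string> names;
        for (const auto &entry : d.entries) {
            if (find(entry.second) == nullptr) names.push_back(entry.first);
        }
        return names;
    }

private:
    std::map<Handle, RawObject> objects_;
};
```

In a real file, handles are shared across entities, table entries, and objects, so `allocate()` is only safe if every handle in the file has been registered, and the header's $HANDSEED value should be advanced to match.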
XDATA specifics and best practices
- Always check application name (1001) when parsing XDATA to ensure you interpret values in the correct schema.
- XDATA field types in DXF include strings (1000), control strings (1002), doubles (1040), integers (1070/1071), and nested structures. Map them into strongly-typed containers.
- Preserve unknown XDATA fields if not recognized.
- Beware of 1002 group-code blocks that use braces { } to mark nested structures—parse them carefully.
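One way to map XDATA into typed values while tracking 1002 nesting is a flat list in which each item records its brace depth. This is a minimal sketch, not libdxfrw's representation: `XItem`, `XValue`, and `parseXData` are hypothetical, and the type mapping follows the general DXF group code ranges (1000–1009 strings, 1010–1059 doubles, 1060–1071 integers). It expects the groups that follow a 1001 header whose application name has already been checked.

```cpp
#include <cstdint>
#include <exception>
#include <string>
#include <utility>
#include <variant>
#include <vector>

using Group = std::pair<int, std::string>;
using XValue = std::variant<std::string, double, std::int32_t>;

struct XItem {
    int code;
    int depth;     // number of enclosing 1002 "{" groups
    XValue value;
};

// Convert raw XDATA groups into typed items.
// Returns false and sets `error` on bad numbers or unbalanced braces.
bool parseXData(const std::vector<Group> &groups, std::vector<XItem> &out, std::string &error) {
    int depth = 0;
    for (const Group &g : groups) {
        const int code = g.first;
        const std::string &text = g.second;
        try {
            if (code == 1002) {
                if (text == "}" && --depth < 0) {
                    error = "unmatched } in XDATA";
                    return false;
                }
                out.push_back(XItem{code, depth, text}); // braces sit at the outer depth
                if (text == "{") ++depth;
            } else if (code >= 1010 && code <= 1059) {
                out.push_back(XItem{code, depth, std::stod(text)});
            } else if (code >= 1060 && code <= 1071) {
                out.push_back(XItem{code, depth, static_cast<std::int32_t>(std::stol(text))});
            } else {
                out.push_back(XItem{code, depth, text}); // strings, layer names, handles, ...
            }
        } catch (const std::exception &) {
            error = "bad numeric value in group " + std::to_string(code);
            return false;
        }
    }
    if (depth != 0) {
        error = "unclosed { in XDATA";
        return false;
    }
    return true;
}
```

Keeping the brace groups in the item list means the original sequence can be written back in order, including empty `{ }` pairs.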
Error handling and robustness
- Be permissive in parsing: accept missing optional groups, ignore unknown group codes (but store if preserving), and normalize whitespace in string fields.
- Emit warnings or log messages for unexpected values, but avoid hard failures on malformed vendor files.
- Validate on write: ensure required group codes exist and handles are consistent.
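The parsing and validation advice above might look like the following in practice. This is a sketch with hypothetical helpers (`ParseLog`, `parseDoubleOr`, `missingCodes`): malformed numbers produce a warning and a default instead of an exception, and a pre-write check reports which required group codes are absent.

```cpp
#include <cctype>
#include <cstddef>
#include <exception>
#include <string>
#include <utility>
#include <vector>

struct ParseLog {
    std::vector<std::string> warnings;
};

// Parse a double, tolerating surrounding whitespace.
// On failure, record a warning and return the fallback instead of throwing.
double parseDoubleOr(const std::string &text, double fallback, int code, ParseLog &log) {
    try {
        std::size_t pos = 0;
        const double v = std::stod(text, &pos);
        while (pos < text.size() && std::isspace(static_cast<unsigned char>(text[pos]))) ++pos;
        if (pos == text.size()) return v;
    } catch (const std::exception &) {
        // fall through to the warning below
    }
    log.warnings.push_back("group " + std::to_string(code) + ": bad number '" + text +
                           "', using default");
    return fallback;
}

// Validation before writing: which required group codes are absent?
std::vector<int> missingCodes(const std::vector<std::pair<int, std::string>> &groups,
                              const std::vector<int> &required) {
    std::vector<int> missing;
    for (int code : required) {
        bool found = false;
        for (const auto &g : groups) {
            if (g.first == code) { found = true; break; }
        }
        if (!found) missing.push_back(code);
    }
    return missing;
}
```

Collecting warnings in a log object rather than printing them lets the calling application decide whether a vendor file is usable.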
Performance considerations
- When preserving raw group streams, store them compactly (e.g., as vector of small structs) to avoid excessive string allocations.
- Lazy-parse XDATA only when the application needs to inspect it; otherwise keep as opaque data for faster read/write.
- For very large DXFs, stream processing (read–process–write) without constructing full in-memory models reduces memory usage.
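The streaming option can be sketched as a filter that copies groups from input to output one pair at a time, so memory use stays constant regardless of file size. `GroupFilter` and `streamGroups` are hypothetical names, and the sketch works on raw group pairs rather than through libdxfrw.

```cpp
#include <cstddef>
#include <exception>
#include <functional>
#include <sstream>
#include <string>

// The callback may edit the value in place; returning false drops the group.
using GroupFilter = std::function<bool(int code, std::string &value)>;

// Copy groups from `in` to `out`, passing each through `filter`.
// Holds only one group in memory at a time. Returns the number of groups written.
std::size_t streamGroups(std::istream &in, std::ostream &out, const GroupFilter &filter) {
    std::size_t written = 0;
    std::string codeLine, value;
    while (std::getline(in, codeLine) && std::getline(in, value)) {
        if (!value.empty() && value.back() == '\r') value.pop_back();
        int code = 0;
        try {
            code = std::stoi(codeLine);
        } catch (const std::exception &) {
            break; // malformed code line: stop copying
        }
        if (filter(code, value)) {
            out << code << '\n' << value << '\n';
            ++written;
        }
    }
    return written;
}

// Convenience overload for in-memory text.
std::string streamGroups(const std::string &dxf, const GroupFilter &filter) {
    std::istringstream in(dxf);
    std::ostringstream out;
    streamGroups(in, out, filter);
    return out.str();
}
```

A filter that renames a layer, for example, only needs to look at group 8 values. Dropping individual groups is riskier, since removing part of an entity can leave the file inconsistent.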
Testing and validation
- Create unit tests for:
  - Parsing known custom entities from sample DXFs.
  - Round-trip preservation: read → write → re-read and compare semantic fields.
  - Dictionary and handle integrity: adding/removing entries, ensure handles resolve.
  - XDATA parsing/serialization for all supported types.
- Use real-world DXF samples from the vendor when possible. If not available, construct test files that mimic the vendor’s group-code patterns.
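For the round-trip test, a semantic comparison of group sequences is usually more useful than a byte comparison, because writers format numbers differently. The sketch below uses hypothetical helpers (`isRealCode`, `sameGroups`): floating-point groups are compared numerically with a tolerance and everything else as exact text. The code ranges treated as floating point follow the general DXF group code table, and padding of integer values is deliberately not normalized.

```cpp
#include <cmath>
#include <cstddef>
#include <exception>
#include <string>
#include <utility>
#include <vector>

using Group = std::pair<int, std::string>;

// Group code ranges that carry floating-point values.
bool isRealCode(int code) {
    return (code >= 10 && code <= 59) || (code >= 110 && code <= 149) ||
           (code >= 210 && code <= 239) || (code >= 460 && code <= 469) ||
           (code >= 1010 && code <= 1059);
}

// Compare two group sequences: reals numerically within `tol`, the rest as exact text.
bool sameGroups(const std::vector<Group> &a, const std::vector<Group> &b, double tol = 1e-9) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i].first != b[i].first) return false;
        if (a[i].second == b[i].second) continue;
        if (!isRealCode(a[i].first)) return false;
        try {
            if (std::fabs(std::stod(a[i].second) - std::stod(b[i].second)) > tol) return false;
        } catch (const std::exception &) {
            return false; // differing text that is not numeric
        }
    }
    return true;
}
```

In a unit test, read the original and the re-written file into group lists and assert that `sameGroups` holds. A stricter variant could also report the index of the first mismatch to make failures easier to diagnose.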
Documentation and distribution
- Document the new entity classes, their group-code mapping, and any XDATA schema (application name, expected codes/types).
- If distributing your changes upstream, follow libdxfrw contribution guidelines: keep changes minimal, well-documented, and provide test cases.
Example checklist for implementing support
- [ ] Decide preserve-only vs typed implementation
- [ ] Inspect library reader/dispatcher for integration point
- [ ] Implement container or class for custom entity
- [ ] Capture and preserve XDATA/dictionary entries
- [ ] Implement write serialization to reproduce original stream
- [ ] Add unit tests and sample DXF files
- [ ] Optimize memory and streaming behavior for large files
- [ ] Document public API and usage
Extending libdxfrw to handle custom entities and attributes is mostly engineering: choose whether you need typed access or lossless preservation, integrate with the reader/writer hooks, parse the appropriate group codes (including XDATA and dictionaries), and ensure robust, well-tested serialization. The approach outlined above balances practicality (preserve unknown data) and depth (implement typed classes when you need to manipulate custom content).