Using CSV or TDV format for representing multi-level nested data structures

Please have a look at my proposal to use CSV as a lightweight data-interchange format, as an alternative to JSON and XML.

 

Multi-level nested CSV TDV.pdf

(or)

http://siara.cc/csv_ml

 

Screenshot of reference implementation:

post-112853-0-98477300-1437640171_thumb.png

 

The demo application (Java) can be downloaded from: http://siara.cc/csv_ml/csv_ml_swing_demo-1.0.0.jar

 

The proposed format is expected to:

  • save storage space (about 50% compared to JSON and 60-70% compared to XML)
  • increase data transfer speeds
  • be faster to parse compared to XML and JSON
  • allow full schema definition and validation
  • make schema definition simple, lightweight and in-line compared to DTD or XML Schema
  • allow database binding
  • be used in EAI (Enterprise Application Integration) for import and export of data
  • be simpler to parse, allowing data to be available even on low-memory devices
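
To make the storage-space point concrete, here is a rough sketch that serializes the same records as JSON, XML and CSV and compares the byte counts. It uses plain flat CSV from the Python standard library, not the multi-level format from the proposal, and the sample records and element names are made up for illustration, so the ratios it prints are only indicative.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample records; not taken from the proposal.
records = [
    {"name": "Alice", "subject": "chemistry", "score": 82},
    {"name": "Bob", "subject": "physics", "score": 75},
    {"name": "Carol", "subject": "mathematics", "score": 91},
]


def as_json(rows):
    # Compact separators so JSON is not penalised for optional whitespace.
    return json.dumps(rows, separators=(",", ":"))


def as_xml(rows):
    # One <student> element per record, one child element per field.
    root = ET.Element("students")
    for row in rows:
        student = ET.SubElement(root, "student")
        for key, value in row.items():
            ET.SubElement(student, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def as_csv(rows):
    # Field names appear once in the header row instead of in every record.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


sizes = {
    "json": len(as_json(records).encode("utf-8")),
    "xml": len(as_xml(records).encode("utf-8")),
    "csv": len(as_csv(records).encode("utf-8")),
}
for fmt, size in sizes.items():
    print(fmt, size, "bytes")
```

Most of the difference comes from the field names: CSV writes them once in the header, while JSON and XML repeat them for every record.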

The given demos convert between CSV, XML and JSON (CSV to XML DOM, CSV to JSON, XML to CSV).

 

Github home page: https://github.com/siara-cc/csv_ml

You could save even more storage space if you introduced banks of available answers for some fields.

For example, assign chemistry=1, physics=2, mathematics=3, etc.,

and then use 1, 2, 3 instead of the full-text versions.
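
A minimal sketch of that idea, assuming a lookup table that both writer and reader agree on. The table, the row layout and the function names are hypothetical and are not part of the proposed format.

```python
# Hypothetical lookup table for a field with a fixed set of values.
SUBJECT_CODES = {"chemistry": 1, "physics": 2, "mathematics": 3}
SUBJECT_NAMES = {code: name for name, code in SUBJECT_CODES.items()}


def encode_row(row):
    # Replace the full-text subject with its short numeric code.
    name, subject, score = row
    return [name, SUBJECT_CODES[subject], score]


def decode_row(row):
    # Restore the full-text subject from its code.
    name, code, score = row
    return [name, SUBJECT_NAMES[int(code)], score]


rows = [["Alice", "chemistry", 82], ["Bob", "physics", 75]]
encoded = [encode_row(r) for r in rows]
decoded = [decode_row(r) for r in encoded]
```

The saving grows with the number of rows, since the table is stored once while every row carries only the short code.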

 

BTW, XML is highly extensible. One can write a loader/saver that ignores unknown tags and attributes (from older/newer versions of the software), and there is a high chance it will work.

Your solution won't. It's very limited and not extensible.
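
The tolerant XML loader described above could look roughly like this. It is only a sketch using Python's standard `xml.etree.ElementTree`; the element names, the set of known fields and the sample document are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Fields this (hypothetical) version of the software understands.
KNOWN_FIELDS = {"name", "subject"}


def load_students(xml_text):
    root = ET.fromstring(xml_text)
    students = []
    for node in root.findall("student"):
        # Keep only child elements we know; unknown tags are skipped,
        # and attributes are never read, so new ones do no harm.
        record = {
            child.tag: child.text
            for child in node
            if child.tag in KNOWN_FIELDS
        }
        students.append(record)
    return students


# A document written by a newer version: extra attributes and a new tag.
NEWER_DOCUMENT = """
<students version="2">
  <student id="7">
    <name>Alice</name>
    <subject>chemistry</subject>
    <nickname>Al</nickname>
  </student>
</students>
"""

students = load_students(NEWER_DOCUMENT)
print(students)
```

The loader works because every value in XML is labelled by its tag, so the reader can pick out what it recognises and skip the rest.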

Edited by Sensei

  • Author

Hi @Sensei,

 

Thank you. I think your first idea is about having a dictionary for some fields and using index positions to refer to its entries. It is an excellent suggestion and I want to add it. Where the possible values are fixed, as in an LOV (List of Values), it will save even more space.

 

While I agree with your second point, this is the way I look at it: XML is too open to suit my purpose. I am not trying to replace XML for all the purposes it is used for, but only for representing relational data, where the schema is more or less fixed and a huge amount of data has to be stored and transferred, as in three-tier architectures and RDBMSs.

 

If this is adopted for new designs, it could greatly reduce the redundant data being transferred and processed. I think this would have a positive impact on the amount of energy being used (in terms of watts).

 

But I don't understand in what way my solution is not extensible. Can you give an example?

 

Again thanks for the valuable feedback.

 

Regards

Arun
