Jump to content

Data Structure Optimization


Rasori

Recommended Posts

Hi all,

 

I'm trying to find out how to establish when to optimize for processor usage versus memory usage. In the application I'm working on it's not actually a big deal (not large-scale enough to worry about) but I'm curious how it's decided for more critical tasks.

 

The situation is something like this:

Vehicle X has Customer Y, where Y is the Owner of X.

It would be possible for each Customer Y to have a list of all Vehicles they own stored within them.

Which would be preferred: the increased memory usage of having each Customer Y list all Vehicles they own, or the increased processing time required to go through all Vehicles and return a list of all Vehicles where Vehicle.Owner is Customer Y?

 

In my application, with a locally-stored database that isn't remarkably large, I intend to just go ahead and use the extra memory. Both usage cases (needing to select one of the customer's vehicles in order to start an invoice or similar, or needing to find the owner of a certain vehicle currently on the premises) are quite likely to crop up, so I figure it would be nice to be able to approach the problem equally effectively from either direction.

 

Is there a best-practice to determine how to make a decision like this under more critical circumstances?

 

Thanks!

Link to comment
Share on other sites

I've been trying to get this post up, but I've been travelling:

 

I've actually been thinking about Xittenn's suggestion quite a bit and I'm trying to determine if it really is optimized.

 

Say each customer has a list of Vehicle references (presumably primary key values for the corresponding vehicles). To then pull up the information for those vehicles still requires a query on the entire database of vehicle entries, does it not? Whereas before the query was (in pseudocode) "get all VEHICLES where OWNER is Owner_Name," now isn't it "get all VEHICLES where PRIMARY_KEY is one of [PK_Values]" ?

 

It seems a massive memory improvement over storing the entire vehicle, but does it truly save performance?

 

I think, basically, this question comes down to: are databases generally optimized for Primary Key checks? If (as I believe) yes, then just how speedy is this optimized query?

Link to comment
Share on other sites

If this is an SQL database as you imply, the primary key column should be indexed by the database engine for fast searching, and should be very fast. Of course, you can also add an index to the Owner_Name column and get similar speed.

 

Actually, I'd do this with three tables:

 

customer

  • customer_id
  • name
  • etc.

vehicle

  • vehicle_id
  • make
  • model
  • etc.

vehicle_owners

  • vehicle_id
  • customer_id

Then you just SELECT * FROM vehicle, customer, vehicle_owners WHERE customer.customer_id = vehicle_owners.customer_id AND vehicle.vehicle_id = vehicle_owners.vehicle_id AND customer.customer_id = theoneI'mlookingfor. Indexes on all the *_id columns and this should be blazing fast. On a database with FOREIGN KEY constraints available, it'll even provide checks to make sure a vehicle doesn't have a nonexistent owner and such.

Link to comment
Share on other sites

Say each customer has a list of Vehicle references (presumably primary key values for the corresponding vehicles). To then pull up the information for those vehicles still requires a query on the entire database of vehicle entries, does it not? Whereas before the query was (in pseudocode) "get all VEHICLES where OWNER is Owner_Name," now isn't it "get all VEHICLES where PRIMARY_KEY is one of [PK_Values]" ?

 

It seems a massive memory improvement over storing the entire vehicle, but does it truly save performance?

 

I think, basically, this question comes down to: are databases generally optimized for Primary Key checks? If (as I believe) yes, then just how speedy is this optimized query?

 

Forgive me much:

http://msdn.microsoft.com/en-us/library/aa964133%28v=sql.90%29.aspx

Link to comment
Share on other sites

assign an n-bit key to each vehicle. the first few bits indicate the owner id(obviously theres an id field for each owner also). the last few are for unique identification. sort the list of owners in the database. then, everytime u have a vehicle, you can use a binary search to find the owner efficiently, and if you have an owner, its easy to find his vehicles.

since were using bitfield entries instead of strings, it occupies much less place in memory, and its still efficient.

Link to comment
Share on other sites

  • 3 weeks later...

the first thing comes to a developer's mind, when talking about Optimization & Data Structure is Hashing Table !

 

another point is, even Data Structures are logical when a program is running, but you can create a virtual cache on use ...

 

Graph frequency of use on the memory levels, or balance your priority-based AVL trees.

 

Anyway, good luck

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.