I have a large list of objects where each object has a unique
(non-overlapping) date range. The list is sorted chronologically . What is
the most efficient way to search this list for a single object that spans a
specified date? 9 3141
Hello Rick,
>I have a large list of objects where each object has a unique (non-overlapping) date range. The list is sorted chronologically . What is the most efficient way to search this list for a single object that spans a specified date?
My suggestion below... you mention that the list is sorted
chronologically . I don't think that information is of any use here, as
long as you don't have any good information about the distribution of the
elements in the list, or the lengths of the spans covered by a typical
element. Of course you could implement an algorithm that looks at a first
element and tries to heuristically guess where the element you're looking
for may be in the list - but without knowing the total range of elements,
the number, the distribution and the lengths of the spans, I find it very
hard to say whether the algorithm would be any more efficient than the
basic one below.
If you want efficiency without having good estimates of these details,
maybe you should organize your data in a different way - using
dictionaries for certain ranges of data, for instance. Once more it's hard
to say what that structure should best look like, without knowing details
about the data you're expecting.
And finally, don't forget: it's usually better to write a reasonably good
algorithm and to optimize it only when it proves to be a problem.
[TestFixture]
public class Tests {
[Test]
public void Test( ) {
List<ClassValid WithinRangelist = new List<ClassValid WithinRange>(ne w
ClassValidWithi nRange[] {
new ClassValidWithi nRange(new DateTime(2007, 1, 1), new DateTime(2007,
1,15)),
new ClassValidWithi nRange(new DateTime(2007, 1, 16), new
DateTime(2007, 1,31)),
new ClassValidWithi nRange(new DateTime(2007, 2, 1), new DateTime(2007,
2,28))
});
ClassValidWithi nRange checkVal = new ClassValidWithi nRange(new
DateTime(2007, 3, 1), new DateTime(2007, 3, 31));
list.Add(checkV al);
ClassValidWithi nRange foundVal =
list.Find(deleg ate(ClassValidW ithinRange entry) {
return entry.IsValidOn (new DateTime(2007, 3, 15));
});
Assert.AreEqual (checkVal, foundVal);
}
}
public class ClassValidWithi nRange {
public ClassValidWithi nRange(DateTime from, DateTime to) {
this.from = from;
this.to = to;
}
public bool IsValidOn(DateT ime date) {
// Implement with your choice of <, <=, and >=
return from < date && to date;
}
private DateTime from;
public DateTime From {
get {
return from;
}
set {
from = value;
}
}
private DateTime to;
public DateTime To {
get {
return to;
}
set {
to = value;
}
}
}
Oliver Sturm
-- http://www.sturmnet.org/blog
Thanks Oliver.
In short, you recommend using the Find method of the List<Tclass. This
performs a linear search, which is basically what I'm doing now. It's
acceptable but hardly optimal. I believe that a binary search could be more
efficient, even where the distribution is uneven. The problem is that the
BinarySearch methods provided by .NET don't seem to support searching using
an IComparer alone.
"Oliver Sturm" <ol****@sturmne t.orgwrote in message
news:xn******** ********@msnews .microsoft.com. ..
Hello Rick,
>>I have a large list of objects where each object has a unique (non-overlapping) date range. The list is sorted chronologically . What is the most efficient way to search this list for a single object that spans a specified date?
My suggestion below... you mention that the list is sorted
chronologically . I don't think that information is of any use here, as
long as you don't have any good information about the distribution of the
elements in the list, or the lengths of the spans covered by a typical
element. Of course you could implement an algorithm that looks at a first
element and tries to heuristically guess where the element you're looking
for may be in the list - but without knowing the total range of elements,
the number, the distribution and the lengths of the spans, I find it very
hard to say whether the algorithm would be any more efficient than the
basic one below.
If you want efficiency without having good estimates of these details,
maybe you should organize your data in a different way - using
dictionaries for certain ranges of data, for instance. Once more it's hard
to say what that structure should best look like, without knowing details
about the data you're expecting.
And finally, don't forget: it's usually better to write a reasonably good
algorithm and to optimize it only when it proves to be a problem.
[TestFixture]
public class Tests {
[Test]
public void Test( ) {
List<ClassValid WithinRangelist = new List<ClassValid WithinRange>(ne w
ClassValidWithi nRange[] {
new ClassValidWithi nRange(new DateTime(2007, 1, 1), new DateTime(2007,
1,15)),
new ClassValidWithi nRange(new DateTime(2007, 1, 16), new DateTime(2007,
1,31)),
new ClassValidWithi nRange(new DateTime(2007, 2, 1), new DateTime(2007,
2,28))
});
ClassValidWithi nRange checkVal = new ClassValidWithi nRange(new
DateTime(2007, 3, 1), new DateTime(2007, 3, 31));
list.Add(checkV al);
ClassValidWithi nRange foundVal = list.Find(deleg ate(ClassValidW ithinRange
entry) {
return entry.IsValidOn (new DateTime(2007, 3, 15));
});
Assert.AreEqual (checkVal, foundVal);
}
}
public class ClassValidWithi nRange {
public ClassValidWithi nRange(DateTime from, DateTime to) {
this.from = from;
this.to = to;
}
public bool IsValidOn(DateT ime date) {
// Implement with your choice of <, <=, and >=
return from < date && to date;
}
private DateTime from;
public DateTime From {
get {
return from;
}
set {
from = value;
}
}
private DateTime to;
public DateTime To {
get {
return to;
}
set {
to = value;
}
}
}
Oliver Sturm
-- http://www.sturmnet.org/blog
I think I found an answer. I could create an IComparer that compares the
upperbound of each date range. I can then use a BinarySearch to look for an
object with a specific upperbound (equal to my desired date). It will return
either the index of an exact match, or the bitwise complement of the index
next highest object, or the bitwise complement of the list count.
"Rick" <ri**@nospam.co mwrote in message
news:u6******** ******@TK2MSFTN GP06.phx.gbl...
Thanks Oliver.
In short, you recommend using the Find method of the List<Tclass. This
performs a linear search, which is basically what I'm doing now. It's
acceptable but hardly optimal. I believe that a binary search could be
more efficient, even where the distribution is uneven. The problem is that
the BinarySearch methods provided by .NET don't seem to support searching
using an IComparer alone.
"Oliver Sturm" <ol****@sturmne t.orgwrote in message
news:xn******** ********@msnews .microsoft.com. ..
>Hello Rick,
>>>I have a large list of objects where each object has a unique (non-overlapping) date range. The list is sorted chronologically . What is the most efficient way to search this list for a single object that spans a specified date?
My suggestion below... you mention that the list is sorted chronologicall y. I don't think that information is of any use here, as long as you don't have any good information about the distribution of the elements in the list, or the lengths of the spans covered by a typical element. Of course you could implement an algorithm that looks at a first element and tries to heuristically guess where the element you're looking for may be in the list - but without knowing the total range of elements, the number, the distribution and the lengths of the spans, I find it very hard to say whether the algorithm would be any more efficient than the basic one below.
If you want efficiency without having good estimates of these details, maybe you should organize your data in a different way - using dictionaries for certain ranges of data, for instance. Once more it's hard to say what that structure should best look like, without knowing details about the data you're expecting.
And finally, don't forget: it's usually better to write a reasonably good algorithm and to optimize it only when it proves to be a problem. [TestFixture] public class Tests { [Test] public void Test( ) { List<ClassVali dWithinRangelis t = new List<ClassValid WithinRange>(ne w ClassValidWith inRange[] { new ClassValidWithi nRange(new DateTime(2007, 1, 1), new DateTime(2007, 1,15)), new ClassValidWithi nRange(new DateTime(2007, 1, 16), new DateTime(2007, 1,31)), new ClassValidWithi nRange(new DateTime(2007, 2, 1), new DateTime(2007, 2,28)) }); ClassValidWith inRange checkVal = new ClassValidWithi nRange(new DateTime(200 7, 3, 1), new DateTime(2007, 3, 31)); list.Add(check Val);
ClassValidWith inRange foundVal = list.Find(deleg ate(ClassValidW ithinRange entry) { return entry.IsValidOn (new DateTime(2007, 3, 15)); });
Assert.AreEqua l(checkVal, foundVal); } }
public class ClassValidWithi nRange { public ClassValidWithi nRange(DateTime from, DateTime to) { this.from = from; this.to = to; } public bool IsValidOn(DateT ime date) { // Implement with your choice of <, <=, and >= return from < date && to date; } private DateTime from; public DateTime From { get { return from; } set { from = value; } } private DateTime to; public DateTime To { get { return to; } set { to = value; } } }
Oliver Sturm -- http://www.sturmnet.org/blog
Hello Rick,
>In short, you recommend using the Find method of the List<Tclass. This performs a linear search, which is basically what I'm doing now. It's acceptable but hardly optimal. I believe that a binary search could be more efficient, even where the distribution is uneven.
Hm... I'm not sure really - it's a mathematical exercise to prove whether
it would be more efficient under the circumstances or not. There are two
problems:
* the distribution may be seriously uneven, we (I at least) don't have any information on that
* the decision of whether or not a date falls in a given range is not only dependent on one distributed value, but on two different ones.
The second problem brings the length of ranges into the equation - if
there is some regularity to this, the task of estimating efficiency may
become easier.
In the end I think that theoretically a binary search could improve
efficiency, if done right. But as we seem to know practically nothing
about the content of the data list, it would be rather hard to implement
the binary search in a way that is more efficient on average than a linear
search. And of course there's always the storage to think about... linear
search could be made more efficient very easily by splitting the list
based on years, or something similar.
Other things I would think about... I think a linear search algorithm
should probably be efficient enough for a large majority of use cases. In
those cases where the number of objects would be too large to use linear
searching efficiently, it seems like a good question to ask why these huge
numbers of objects are being handled in-memory in the first place - a
relational database could make the search operation much easier and a lot
faster in these particular cases.
Oliver Sturm
-- http://www.sturmnet.org/blog
Hello Rick,
>I think I found an answer. I could create an IComparer that compares the upperbound of each date range. I can then use a BinarySearch to look for an object with a specific upperbound (equal to my desired date). It will return either the index of an exact match, or the bitwise complement of the index next highest object, or the bitwise complement of the list count.
Have fun :-)
Oliver Sturm
-- http://www.sturmnet.org/blog
"Oliver Sturm" <ol****@sturmne t.orgwrote in message
news:xn******** ********@msnews .microsoft.com. ..
Hello Rick,
>>In short, you recommend using the Find method of the List<Tclass. This performs a linear search, which is basically what I'm doing now. It's acceptable but hardly optimal. I believe that a binary search could be more efficient, even where the distribution is uneven.
Hm... I'm not sure really - it's a mathematical exercise to prove
whether it would be more efficient under the circumstances or not.
There are two problems:
* the distribution may be seriously uneven, we (I at least) don't
have any information on that
* the decision of whether or not a date falls in a given range is not
only dependent on one distributed value, but on two different ones.
Neither of these points will effect the speed of the search in the
slightest.
Remember, we are not talking about a (possibly imbalanced) binary tree.
We are talking about using a binary search algorithm over a sorted list.
That algorithm will give you the same performance as a perfectly
balanced tree. Uneven distributions of the data will not impact the
search in the slightest.
Also remember that Rick said:
<quote>
I have a large list of objects where each object has a unique
(non-overlapping) date range. The list is sorted chronologically
</quote>
Since the date ranges do NOT overlap there is a clear sequence of data
points
obj1.Start
obj1.End
obj2.Start
obj2.End
....
So, You perform a Binary search using either Start or end (the choice is
irrelevant). This seach will go as O(logN). If you were lucky enough to
hit the date on the money, then you are done. If you did not find the
exact date (far more likely), the algorithm will return a negative
number, which is the bitwise complement of the index of the next
element. In other words, the algorithm has in effect told you which item
in the Array you need to examine (off by one is more like it). Then it
is a simple matter of seeing if your date falls within the Start and End
dates for the object in question.
Of course, unless the number of data points is substantial, the total
time would be negligable in either case.
Hope this helps
Bill
<snip>
Hello Bill,
I understand I was probably mixing up some things here - my ideas were
targeted at a more efficient algorithm that would take the structure into
account, based on the thought that the amount of data in question is so
vast that we start running into problems with the linear search to begin
with.
Oliver Sturm
-- http://www.sturmnet.org/blog
"Oliver Sturm" <ol****@sturmne t.orgwrote in message
news:xn******** ********@msnews .microsoft.com. ..
Hello Bill,
I understand I was probably mixing up some things here - my ideas were
targeted at a more efficient algorithm that would take the structure
into account, based on the thought that the amount of data in question
is so vast that we start running into problems with the linear search
to begin with.
You do bring up a good point.
Knowledge of your data, and how it is likely to be searched can go a
LONG way towards picking the best approach.
If most of the time you are accessing data near the ends of the range
the linear search may in fact outperform the binary search which does
better for random access.
If the number of data points is not large, it doesn't matter how you
search, since the time is negligible in either case. Unless you are
sitting in a tight loop processing requests continuously that is.
Bill
Hello Bill,
>Knowledge of your data, and how it is likely to be searched can go a LONG way towards picking the best approach.
That's what I had in mind - maybe too much so, granted.
>If the number of data points is not large, it doesn't matter how you search, since the time is negligible in either case. Unless you are sitting in a tight loop processing requests continuously that is.
Oh, of course. I didn't mean my comments somewhere on the thread to say
that it's meaningless to search for efficiency in this regard. But I find
it significant to think about the best way to achieve efficiency - for
instance by changing the storage structure instead of optimizing searches
-, and this depends on the use case a lot.
Oliver Sturm
-- http://www.sturmnet.org/blog This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics |
by: VK |
last post by:
Hello,
In my object I have getDirectory() method which returns 2-dimentional
array
(or an imitation of 2-dimentional array using two JavaScript objects
with auto-handled length property - please let's us do not go into an
"each dot over i" clarification discussion now - however you want to
call - you call it ;-)
array contains records of all files in the current dir.
array contains records of all subs in the current dir
|
by: decrypted |
last post by:
I am having trouble with creating / finding effective searching algorithms.
I constantly run into a situation where I have a list of objects and need to
find out if a property of one of these objects matches some
criteria...especialy if the criteria is a list itself. Example...
class Part has private PartID
A listview of items with each tag having a Part object.
I have an array of PartID's denoting the parts I want to remove from the...
|
by: Ben Fidge |
last post by:
Hi
What is the most efficient way to gather a list of files
under a given folder recursively with the ability to
specify exclusion file masks?
For example, say I wanted to get a list of all files
under \Program Files and it's subdirectories that meet
the following criteria:
|
by: Russell |
last post by:
Hey,
ok i have numerous tables to search through for a 'site search'.
some of the searchble fields have html embeded within so after some quick
referencing, saw I can use the regExp function to strip out all the HTML
leaving only the raw text.
(done and works a treat)
My issue is:
|
by: Chung Leong |
last post by:
Here's the rest of the tutorial I started earlier:
Aside from text within a document, Indexing Service let you search on
meta information stored in the files. For example, MusicArtist and
MusicAlbum let you find MP3 and other music files based on the singer
and album name; DocAuthor let you find Office documents created by a
certain user; DocAppName let you find files of a particular program,
and so on.
Indexing Service uses plug-ins...
| |
by: Amit Bhatia |
last post by:
Hi,
I am not sure if this belongs to this group. Anyway, my question is as
follows: I have a list (STL list) whose elements are pairs of integers (STL
pairs, say objects of class T). When I create a new object of class T, I
would like to check if this object already exists in the list: meaning one
having same integers. This can be done in linear time in a list, and
probably faster if I use STL Set instead of list. I am wondering however if...
|
by: Computer geek |
last post by:
Hello, I am new to VB.NET and programming in general. I have taught
myself a lot of the basics with vb.net but am still quite the novice.
I am working on a little application now and I need some help with one
part of the code. When a button is clicked I need to have it go out to
a network drive location and count how many files are present with a
certain file extension. Then store that number in a declared variable.
Is this possible? Can...
|
by: =?Utf-8?B?YWJjZA==?= |
last post by:
I have a generic list of objects. Each object contains multiple objects and
one of the object has some properties. Now I need to filter the list based on
those properties. Whats the best way for me. I tried foreach loop but I am
unable to write the filter criteria. All the proeperties I need to filter
must have AND operation. I need to consider conditions like blank etc. To use
list.Find I need to write predicate...any samples anywhere...
...
|
by: Eugenio |
last post by:
Hi there, I'm returning to this forum for the second time and I would like to say thanks for the great help provided. I've encountered a new problem now and hope that you will be able to help me again.
I'm currently implementing a container monitoring system in MSAccess07 and I'm using a multiple search function to sort containers. I use several unbound text boxes and a list box.The user type's in partial search criteria in one or several...
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look !
Part I. Meaning of...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it.
First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
| |
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth.
The Art of Business Website Design
Your website is...
|
by: agi2029 |
last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own....
Now, this would greatly impact the work of software developers. The idea...
|
by: isladogs |
last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM).
In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules.
He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms.
Adolph will...
|
by: conductexam |
last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one.
At the time of converting from word file to html my equations which are in the word document file was convert into image.
Globals.ThisAddIn.Application.ActiveDocument.Select();...
|
by: TSSRALBI |
last post by:
Hello
I'm a network technician in training and I need your help.
I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs.
The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols.
I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
|
by: adsilva |
last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
| |
by: bsmnconsultancy |
last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...
| |