I'm trying to get the distance traveled in a collection of routes on a straight line. The rule to determine the distance is that a route segment counts only once. For example given the segments (50, 70)
and (60, 80)
, the total distance covered is 30 instead of 40, because the segment (60, 70)
is covered twice.
Sample:
╔═══════╦═══════╦═══════╗
║ Route ║ From ║ To ║
╠═══════╬═══════╬═══════╣
║ 1 ║ 100 ║ 110 ║
║ 2 ║ 115 ║ 116 ║
║ 3 ║ 100 ║ 111 ║
║ 4 ║ 0 ║ 15 ║
║ 5 ║ 50 ║ 110 ║
║ 6 ║ 70 ║ 120 ║
╚═══════╩═══════╩═══════╝
In these routes the actual distance traveled was 85m. I made a solution that uses a bit array to control, and a counter to calculate the distance, with this solution the process speed and the memory consumption is good, up to 20250000 meters, after this distance the performance begins to fall. Could they recommend me some algorithm to solve this kind of problem?
Note: These distance are in a straight line, they are not routes that have georeference, they are segments of a straight line.
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
public class MainClass {
static Estacion[] estaciones = new Estacion[]{
new Estacion(100,110),
new Estacion(115,116),
new Estacion(100,111),
new Estacion(0,15),
new Estacion(50,110),
new Estacion(70,120)
};
public static void Main (string[] args) {
Console.WriteLine (Calcular());
}
public static int Calcular()
{
var min =estaciones.Min(m=> m.Desde);
var max =estaciones.Max(m=> m.Hasta);
var recorrido =0;
var cantidad= max-min;
var bA = new BitArray(cantidad,false);
foreach(var e in estaciones.OrderByDescending(o=> o.Longitud))
{
var h= ((e.Hasta-e.Desde)/2);
for(int i= 0;i<= h ; i++)
{
var slot=i+e.Desde-min;
if(slot<cantidad)
if(!bA.Get(slot))
{
bA.Set(slot,true);
recorrido++;
}
slot=e.Hasta-i-min-1;
if(!bA.Get(slot))
{
bA.Set(slot,true);
recorrido++;
}
if(recorrido== cantidad)
return recorrido;
}
}
return recorrido;
}
public class Estacion
{
public Guid Id {get; set;}
public int Desde {get; set;}
public int Hasta {get; set;}
public int Longitud {get; set;}
public Estacion(int a, int b)
{
Desde = a;
Hasta= b;
Longitud= Hasta-Desde;
Id= Guid.NewGuid();
}
}
}
-
5\$\begingroup\$ Just a small note: please use English to write code if you plan to show it to the world. If you want more people are able to improve your code, you should make it understandable as much as possible. There are many people from non-English speaking countries here (including me) and English is the only language which is understandable for all of them. \$\endgroup\$Maxim– Maxim2017年09月02日 03:29:18 +00:00Commented Sep 2, 2017 at 3:29
2 Answers 2
Algorithm
The time and space complexity of the current algorithm depends on the length of intervals in the input. This is very inefficient for inputs with long intervals, even when there are only few intervals.
The problem can be reworded as: merge the intervals and calculate their total length. This should be fairly simple to implement efficiently:
- Sort the intervals by start position ascending, and end position descending. That is,
(5, 10)
should come before(5, 8)
. - Initialize the
previous
interval = the first, andtotal = 0
- Iterate from the second interval until the end, at each step:
- If the current interval is fully included in
previous
, then skip it - If the current interval overlaps with
previous
, then update or replaceprevious
to fully include both the previous and the current - If the current interval doesn't overlap with
previous
, then add the length ofprevious
tototal
, and replaceprevious
with the current
- If the current interval is fully included in
- At the end, add the length of
previous
tototal
Readability
There are serious readability concerns with this code. This writing style is error-prone and confusing:
var slot=i+e.Desde-min; if(slot<cantidad) if(!bA.Get(slot)) { bA.Set(slot,true); recorrido++; } slot=e.Hasta-i-min-1;
The above should have been written like this:
var slot = i + e.Desde - min;
if (slot < cantidad)
{
if (!bA.Get(slot))
{
bA.Set(slot, true);
recorrido++;
}
}
slot = e.Hasta - i - min - 1;
This way the scope of the if
statements is perfectly clear.
I also added spaces around operators and after commas as per the convention.
As a side note, in your algorithm the slot < cantidad
condition is always true, so in this example the first if
statement had no effect,
as if it wasn't even there.
I think your code is not that bad and for small segments and data sets it is all right. May be a better naming of the class Estacion would have turned you in the right analytical direction. I'm not that familiar with your language, but I think Estacion
translates to Station
or Point
in English where a more appropriate name would have been Segment
or LineSegment
.
The class Estacion
is error prone in the following way:
In the constructor you calculate Longitud= Hasta-Desde;
but both Hasta
and Desde
(and Longitud
) are public properties and may be changed in the process which will leave the object in an inconsistent state. In stead you should make the Longitud
a read only Property as public int Longitud { get { return Hasta - Desta; } }
or public int Longitud => Hasta - Desta;
.
As for the algorithm it self I think Janos' analysis says it all.