Register | Login
Attackpoint - performance and training tools for orienteering athletes

Discussion: Understanding Terje Mathisen's scripts

in: Orienteering; Gear & Toys

Jun 15, 2018 8:40 PM # 
First and foremost, hello and welcome to all.

As a short introduction, my name is Michal, from Poland, and I'm relatively new to orienteering but trying hard not only to participate, but also to start organising events with my club. I took upon myself to prepare timing system and maps, designed and built a timing stations based on great idea of Stripe Predanic - Open Timing Control, and now I am trying to get as much as possible knowledge about automating map creation.
I've found a number of tools and started tinkering with them. I think I understand now how interesting are LiDAR possibilities, so I bought some LiDAR data (unfortunalty this is not free in Poland, but on the other hand it costs about 1eur per square km so not a huge cost for orienteering map. I started with OL-Laser, then found Karttapullautin (and actually even learned how to write the name correctly I think at last!), and then Terje's scripts. I like how they work and how they are completely open, but even though I work as a programmer, this was my first encounter with PERL. As long as I understand what a code was supposed to be doing, I can understand the language as well, but I got stuck on the vegetation generation scripts.
The "problem" I'm having is getting A LOT of green-striped areas. And I simply can't get my mind around how exactly vegetation script is supposed to be working. I've added a command line switch to disable stripes beeing drawn, and it makes my maps nicer, I'll try to use facit file, but still it would be better to understand how is the need to draw stripes determined, then I could probably alter classification to fit local vegetation types better.
As an example please have a look at the following map:
This was created by running OSM2XMAP program on OSM data, then calculating contours and vegetation from LiDAR and at the end adding vector data from polish Terrain Objects Data Base (BDOT) for the 1:10000 scale. The last is beeing distributed as XML files so I imported those into QGIS and then exported to DXF. I still have to work out why this method doesn't work on lines (only points and areas), or of course write my own xml to dxf script, but this is not likely - I do not know dxf specification for now.

I am sorry for the chaotic nature of this post. I just would be greateful for pointing me in the right directon on the logic behind the script's algorithm.
Jun 16, 2018 12:40 PM # 
Michal, maybe Terje is willing to help you directly. He is here on AP -
Jun 16, 2018 1:26 PM # 
Yes, I've seen and this is main reason I've asked my question here on AttackPoint - seeing people here as friendly and helpful (I've read all the posts I could find on the scripts), also I hope either he or some other people who used his scripts might help me and maybe some other lost soul who might come around later with similar problems - that's why I chose to create a discussion rather than mailing Terje directly.
There are explanations on how the runnability is determined and this I might be able to work out, but undergrowth is beyond by understanding at the moment.
Jun 16, 2018 8:03 PM # 
Friendly and helpful? Cue derailing of thread. Go. ;-)
Jun 21, 2018 9:52 AM # 
I just noticed it is highly improbable not to get an explanation, IF the question is not asked in the middle of huge event such as Jukola, when everyone is too busy following the competition (or participating).

I just wanted to let everyone know I think I finally managed to grasp most of the ideas behind the script, commenting it a lot in the process.
I still find a few thing baffling but there is light at the end of the tunnel.
Jun 21, 2018 10:58 AM # 
Terje Mathisen:
Which version of the scripts are you using?

My current set assumes by default that you have properly ground-classified points, but all the rest is pretty much unclassified. It is adaptive in that the tiles are subdivided (from 256x256 to 128x128 or even 64x64) until small enough to work within the limits of the free LAStools executables.

You can run lasground_new on data that is missing proper ground classification, but this is sometimes a trial & error proposition in that it is easy to get too few ground points under dense vegetation, or you get trees and brush classified as ground, i.e. fake knolls and hillocks.

EDIT: I have just uploaded the latest versions of all the adaptive scripts to
Jun 21, 2018 11:20 AM # 
Terje Mathisen:
Arashi, if you can put an interesting subset of your lidar somewhere I'd like to take a look to see if I can figure out your green stripes!

The green stripes are supposed to occur when you have significant density of returns in the knee to chest range (30 to 130 cm), but I have seen spurious stripes in the middle of California parking lots (!) when the real problem was bad height calibration of adjacent flight lines. I.e. in the overlap area which can be a significant fraction of the stripe width, the next flight line would turn up as points 40-50 cm above the first and therefore give an awful lot of green stripes.
Jun 21, 2018 1:02 PM # 
Hi Terje!
Thanks for the interest. I will try to upload some samples this evening

I was using a set of scrips taken from your page containing your article and referenced somewhere here on AttackPoint, I've seen on other thread that you've been developing your script further but that was what I was able to find.
The map in question does not trigger LASTools limitations, but for other one I'm making for city sprint I had to limit tile size by hand. And stipes show up in most od the parking lots there too!

I understood how histogram is translated into terrain types this morning. As my excuse - I'm used to VHDL as a programming language and it is as strict as they come and PERL is so liberal that I got lost every few instructions because of shortcuts it allows.

An interesting fact - for the scripts to work at all I had to remove 2>nul redirection from every call to LASTools executables or they didn't work. Called by hand from console - no problem, but from PERL - failure every time.
Jun 22, 2018 7:41 AM # 
Terje Mathisen:
(I have been in private contact with Arashi, I'm going to take a look at this issue.)

Re. VHDL vs Perl:

My original start was in Fortran, then Simula, Basic, Modula II and Pascal before I wrote several megabytes of x86 asm, some of these are quite strict indeed. :-)

A little later I took up C(++) and Perl which are close to opposites in the laxness department.

Re. the stderr redirect: This is probably dependent on your version of perl, I have been using v5.24.0 recently.
Jun 22, 2018 8:36 AM # 
Terje Mathisen:
I've now looked at some of the raw LiDAR data Arashi sent me, a couple of LAZ blocks west of a lake: This is data from three heavily overlapped east<->west flight lines, but the green stripes (significant low vegetation) covers almost the entire area, with no obvious flight line correlation.

This unfortunately indicates that either the LiDAR is quite noisy, or that Poland, being so much more fecund than Norway, has a lot more low brush/vegetation in these forests.

If the latter, then a round or two of vegetation calibration will solve the problem:

Locate a number of typical vegetation samples, at least 2 or 3 for each type to be supported, and enter the coordinates in a text file "vegetasjon_facit.txt". On the next run of the vegetation classification program this file will be detected and instead of classifying the map, each of the sample plots will be located in the raw lidar and the calibration file updated with the height histogram values found there.

On any subsequent runs of the program, this calibration file will be used as the template for all areas on the map.

The calibration can of course be iterated, just update the file with new sample plots and/or remove any that turned out to be non-typical.
Jun 22, 2018 9:59 AM # 
So the discussion will be more substantial to others reading it, the areas I had problems with are a forrest/farmland area near Lucienskie lake!AksqIAYZ-BENjxmkIU-wolrIitD7
And a urban area in Warsaw:!AksqIAYZ-BENj362Ea7Vy2izUOgV
In the first case it is very likely there is undergrowth in the forest, but it should not be on fields, since LiDAR flight are supposed to be recorded in winter and fields should be empty at the time. But in the other (urban) case area is pretty much undergrowth free.
I'm looking for an possible issues with ground classification, because of the slope image script produces:!AksqIAYZ-BENjx-u7U-0_1rljZL1

Compared with relief produced by OL-laser!AksqIAYZ-BENkAZ0CIaz0GdmJkhZ

And it seems there is something deeply wrong with the first one.
If I look closely on the second one it look as if there is a grid beeing part of an image, I do not know if this is result of OL-laser or they are somehow embedded into lidar data. Maybe if they are, they mess with lasground and lasclassify? Trying to work this out...
I do not of course mean much less visible relief in OL-Laser output, I mean visibly more noisy rectangles in the script output, absent from OL-Laser png.

I am afraid that using classification will just mask the problem - but to be honest applied on the urban map worked like charm.

Other issue:
While testing the vegetation sample file I noticed there is a problem if one of the points is outside LIDAR range. It is just printed back into the file, so next run is also an update one and so on in an infinite loop. I changed script to printout a warning on stderr and drop the point entirely. Not the best action but works.

Re: stderr redirect
I have ActiveState PERL 5.24.3 installed on one machine, and Strawberry PERL on second one and both behave the same. Maybe something about different language version of Windows perhaps? This happened in the past for me, slightly different behaviour due to different character sets. It's not a problem, but rather an interesting fact.
Jun 22, 2018 10:45 PM # 
thank you both for sharing these comments.. I created a vegetasjon_facit.txt file, which contains about 40 or so lines (maybe that's too many?) of text which is similar to this:


But for some reason the batch file won't produce the vegetation.gif output file like I wanted. This is the command line output, from when it begins to process vegetation..

Updating vegetasjon_facit.txt
Total area: (5698250,2108750)-(5710500,2117000)
0.00 1.25 2.25 1.25 0.00
1.25 4.25 5.25 4.25 1.25
2.25 5.25 6.25 5.25 2.25
1.25 4.25 5.25 4.25 1.25
0.00 1.25 2.25 1.25 0.00

Creating final target image of (12251x8251) pixels, starting at (5698250,2108750)
1/1617 tiles_classified/tile_5698250_2108750.laz
....(bunch of other lines here)
1617/1617 tiles_classified/tile_5710250_2116750.laz
vegetasjon_facit.txt updated!
Usage: giffile|file_with_list_of_gif_files
Could Not Find C:\tmsw\tiles_classified\tile*.gif
Could Not Find C:\tmsw\tiles_classified\tile*.gfw


--- the problem I have is it keeps producing this output no matter how many times I run the batch file, leading me to believe there is a switch somewhere which incorrectly tries to update the vegetasjon_facit.txt file, even though it says it's clearly been updated at least once already. I've examined the "updated" vegetasjon_facit.txt file and there are no changes each time it gets updated (it looks very much like the above segment except there are 30-some more lines). Additionally, all the UTM coords in the file should be within the boundaries of the overall laz file. upon closer inspection in the "" file, it says this code:

my %facit = ();
my %class2mask = ();
my $usefacit = 0;
if (-f $FACIT_FILE) {
$usefacit = 1;
open(F, $FACIT_FILE);
while () {
my ($e, $n, @f) = split(/,/);
if ($f[0] eq 'M') {
for (my $i = 0; $i < 4; $i++) {
$SCALE_FACTOR[$i] = 100 / $f[$i+1];
$facit{scale($e).';'.scale($n)} = join(',',@f);
if (scalar(@f) < 5) {
$usefacit = -1; # Needs updates!
$CORES = 1;
printf(STDERR "%s $FACIT_FILE\n", $usefacit > 0 ? 'Using' : 'Updating');

later it uses the "usefacit" value to update or use that file, so apparently this part is the switch, which determines whether it is using or updating the "vegetasjon_facit.txt" file.

I guess I will have a closer look at the lines where it tests scalar... since that seems to be the switch
$facit{scale($e).';'.scale($n)} = join(',',@f);
if (scalar(@f) < 5)

Thanks to anyone who can help
Jun 23, 2018 5:50 AM # 
Hi Michael,
From my current experiments it seems like the update process is not working correctly for you, my question would be if you get an output at all.
This looks a lot like what I got, when I started learning how tu use Terje's scripts.

But to start with the vegetation file.
Update process means, the scripts should place in every line of vegetation file additional information on vegetation histogram for that point.
So for every line in vegetasjon_facit file it should produce output such as:


If your file does not change, then update fails and it will fail every time, because it still is not proper. That is what a switch you mentioned does - checks if the line has histogram data or not.

My current knowledge leads me to two possible options.
1) points are in different projection than LIDAR data, so they all fall outside the processed area (script takes tile by tile, than checks if any vegetation points are within the tile - if so tile is processed and output histogram recorded for this point. If a point does not fit into any tile it gets written back to vegetation file unchanged)

And this is highly probable! I just looked back at the output you pasted just before posting my reply and it says:
Total area: (5698250,2108750)-(5710500,2117000)
but the second coordinate of vegetation points is VERY far from those bounds
You should check this first!

2) your scrips do not work properly at some point as it was with me. Major reason beeing 2>nul redirects in every call to LAStools , making all those call from PERL to fail WITHOUT A WARNING (warning went all to nul as instructed), unless I added additional printouts and checks after the calls. Funny thing is, it does not work and does not tell you this because of redirection, but if redirection is removed it still won't tell you it fails, because then it works...

I hope Terje won't mind, but you might want to try a code I slightly modified and that works for me (it has some changes that might work worse than Terje's so be warned, but also has all my comments written to help me understand how they work you might find usefull. Some might be emotional I'm not sure if I removed all I vented my frustration thru :) )!AksqIAYZ-BENkCfE8FNrYn6UYvmm

But if you want to try make a copy of vegetation first!
I added an extra function, that check every line before writing it back and if it has no histogram data, just drops it (with a warning)
Jun 23, 2018 6:10 AM # 
Great. Yes, I was not expecting each tile to be the limiting factor but that does make sense. Thank you. And yes, I was wondering if it added histogram data, and because I didn't see any I didn't know it was missing.

And looking at the bounds of the tiles that you pointed out, hmm, maybe I need to get better or different formatted coordinates for the sample points, rather than deriving them from the open orienteering mapper. Thanks for pointing this out.

I will take a look at your code, more commenting is useful. My main goal is to attempt to better characterize/classify this lidar data for the purposes of vegetation mapping. And otherwise the scripts/batch files so far seem to be working okay as far as any nul checking goes, but maybe I will deal with the same error, as the first error which honestly, looks to be pretty basic too!

Thanks a lot! And thanks Terje too, it's great these tools.
Jun 23, 2018 6:29 AM # 
Good luck glad to have helped,

I'm wondering if I wrote everything correctly, english is a second language to me after all, tiles are not limiting factor, it's just that during update only relevant tiles are processed - most likely to save time.

If you get output, just without the calibration with the file then LAStools calls are probably working fine. I would really like to find out one day why I am so special so they do not work for me.

For my experiments I used some plugin in QGIS to copy coordinates in selected projection to clipboard. I didn't even know it's possible to get those directly from OOMapper. It might be a timesaver to do it directly so thank you for the tip!
Jun 23, 2018 7:49 AM # 
Terje Mathisen:
I will of course gladly incorporate any improvements to my toolkit, and you are absolutely correct that the facit file needs to have every single patch updated with histogram values, otherwise all subsequent runs will just try again to find/calculate those histograms.
Jun 26, 2018 4:27 PM # 
It is too early by very much for me tu suggest improvements to the toolkit I've only begun to understand. Just a thought that a warning in this place would help any new users to find a place to look for mistakes.

But your doubts about overlapping motivated me to do some experiments. I've found out, that while polish LiDAR data is provided in squares, all the points are assigned their scan number in the point_source field. Extracting points from single scan gives better results while using blast2dem to visualize ground points and first_returns so I started implementing different program for preprocessing - lasground and lasheight. My idea is to process each flightline separately up to classification in hope, that it will allow for better estimation of ground - each flightline will have ground estimated on it's own.

If someone already tried similar aproach I would be grateful for information how it went, I do not want to redo what someone already tried. Internet searches were not useful about this kind approach.
The difference between single flightline analysis might be seen on pictures I placed on my onedrive:!AksqIAYZ-BENkDS32x1l3jukMJiq

There is a file with point density, and in the area there are three N-S flights and three W-E flights, and pictures were generated for single W-E flight, three overlapping W-E flights and all overlapping flights. It can be seen that in the overlapping areas DEM quality is degraded, edges beeing less sharp and flat surfaces not looking sharp anymore.

All pictures were generated by blast2dem from same set of data (DSM with -first_only parameter and DTM with -keep_class 2). "single flightline" were additionally filtered with "-keep_point_source" parameter.

I've placed my script on 1drv as well. The classification part is really simple, just calls to LAStools, unlike more sophisticated Terje's approach.
Jun 27, 2018 6:17 AM # 
Terje Mathisen:
Arashi, this is interesting. You should definitely report your findings to your government mapping authority since this is a clear indication of at least somewhat faulty flight path calibration.

Please login to add a message.