Splitting Fortran Files into Subroutines

Coding in Fortran, I often find it easier to work with numerous files, each containing a single subroutine. This makes it easier to compare two versions of a file in which the order of subroutines may have changed (since each subroutine is compared on its own instead of the files all at once), it makes compiling faster since less code needs to be recompiled on every edit, and it makes a text editor easier to use, since you can see a greater fraction of a file at once. It's not all good though; sometimes it's nice to have subroutines clustered together in a module. But on the whole, I'd rather deal with 100 duck-sized horses. So the question is: how can I quickly and easily go from one giganto file to many one-subroutine files? A couple of options:

1) f77split – This was probably the first one I tried; works pretty well.

2) f90split – The same idea for newer Fortran.

3) Homebrewed: Fair warning, I haven't used this in a little bit, but it was useful the last time I did, and I believe this is the most recent version. Also, it's not pretty, and it's not really supposed to be; it's supposed to accomplish a fairly simple task as simply and with as little effort as possible.

#!/usr/bin/perl
use strict;

my $filename = $ARGV[0];
chomp($filename);
my $basename = $filename;
$basename =~ s/(.*)\..*/$1/; # strip the file extension

system("mkdir -p $basename"); # make a directory to hold the output files
my $file_open = 0;

open SOURCEFILE, "<", "$filename" or die $!;

while(my $fileline = <SOURCEFILE>)
{
    chomp($fileline);
    # a new subroutine starts here; open a new output file named after it
    if($fileline =~ m/^\s+subroutine\s+(\w+)/i)
    {
        print "now reading subroutine $1\n";
        if($file_open > 0){close(SUBROUTINEFILE);}
        open SUBROUTINEFILE, ">", "$basename/$1.F" or die $!;
        $file_open = 1;
    }
    if($file_open > 0)
    {
        print SUBROUTINEFILE "$fileline\n";
    }
}

if($file_open > 0){close(SUBROUTINEFILE);}
close(SOURCEFILE);
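
To use it, run the script with the big file as its argument, e.g. ./splitter.pl bigfile.F (splitter.pl being whatever you've named the script); each subroutine in bigfile.F ends up in bigfile/<subroutine name>.F.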

Lookup Tables as a Code Optimization

First and foremost, you should generally put off optimizing code until the end and instead think about a coherent and simple design during most of your coding. That approach will often let you avoid duplicating (or triplicating, quadrupling…) effort, or let you bypass certain computations entirely. However, if at the end of the day you find you've done all this and you would still like to accelerate your program, lookup tables are one thing to try. The idea boils down to this: it can be faster to retrieve a value stored in memory than it is to compute that value. So if you need to evaluate a given function 10^5 times on the range [0,1], you might instead compute the function at intervals of 1/10^3 and then use the nearest value stored in the table rather than computing it every time. Of course you lose accuracy, but you gain speed, and if you are willing to tolerate this tiny bit of error you might do a lot better. You could also do some sort of linear interpolation between values in the lookup table, which decreases the error significantly, but again at some cost in speed (there's a sketch of that after the program below).

Here's a little program that lets you change the operation in only one place. You can also change the size of the lookup table and the number of computations (if you're doing fewer evaluations than there are entries in the table, you're always better off computing directly, since filling the table already costs one evaluation per entry).

#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>
#include <vector>

using namespace std;

double my_rand();
double my_operation(double input);
void init_sqrt_table(vector<double> & sqrt_table, double & step);

#ifdef FAST
  double my_sqrt(double input, vector<double> & sqrt_table, int & num_steps);
#else
  double my_sqrt(double input);
#endif

int main()
{
  time_t start,end;
  double execution_time, tmp, maxval;
  vector<double> sqrt_table(1000);
  double step = 0;
  int num_steps;
  time (&start);

#ifdef FAST
  init_sqrt_table(sqrt_table, step);
#endif

  srand (1);
  maxval = 0;
  num_steps = sqrt_table.size();
  for(int i = 0; i < num_steps*100000; i++)
  {
    tmp = my_rand();
#ifdef FAST
    tmp = my_sqrt(tmp, sqrt_table, num_steps); // gets it from a lookup table
#else
    tmp = my_sqrt(tmp); // just calls c sqrt
#endif

    if(maxval < tmp)
    {
      maxval = tmp;
    }
  }

  cout << maxval << endl;
  time (&end);
  execution_time = difftime (end,start);
  cout << "Program execution time: " << execution_time << endl;
  return 0;
}

double my_rand()
{
  return rand()*1.0/(RAND_MAX + 0.0);
}

void init_sqrt_table(vector<double> & sqrt_table, double & step)
{
  cout << "filling sqrt table" << endl;
  step = 1.0 / sqrt_table.size();
  for(size_t i = 0; i < sqrt_table.size(); ++i)
  {
    sqrt_table[i] = my_operation(i * step); // entry i holds the value at i * step
  }
}

#ifdef FAST
double my_sqrt(double input, vector<double> & sqrt_table, int & num_steps)
{
  int index = (int)rint(input * num_steps); // snap to the nearest table entry
  if(index >= num_steps){ index = num_steps - 1; } // input == 1.0 would run off the end
  return sqrt_table[index];
}
#else
double my_sqrt(double input)
{
  return my_operation(input);
}
#endif

double my_operation(double input)
{
  return exp(sqrt(exp(input)));
}
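
As mentioned above, you can trade a little of the speed back for accuracy by interpolating linearly between adjacent table entries instead of snapping to the nearest one. Here's a minimal sketch of that variant, reusing the table and step from the program above (my_sqrt_lerp is a name made up for this illustration):

double my_sqrt_lerp(double input, vector<double> & sqrt_table, double step)
{
  int index = (int)(input / step); // lower bracketing entry
  if(index >= (int)sqrt_table.size() - 1) // clamp at the top of the table
  {
    return sqrt_table[sqrt_table.size() - 1];
  }
  double frac = input / step - index; // fractional distance toward the next entry
  return (1.0 - frac) * sqrt_table[index] + frac * sqrt_table[index + 1];
}

To compare the two paths, compile once with the FAST macro defined (with g++, add -DFAST) and once without, and compare the reported execution times.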

Pattern Noise Reduction

This little algorithm seems to significantly reduce pattern noise in digital images. Instead of doing dark-frame subtraction, the output value at each pixel is taken to be a weighted average of the initial value, with weight (1 - dark), and the value of that pixel in a gaussian-blurred copy of the image, with weight dark, where dark is an element of [0,1].

I wrote this because I wanted something I could apply to webcam images both night and day that would eliminate hot pixels in night shots and would not create dark spots in day shots. Seems to work pretty well.

This is pseudo-Perl, shelling out to the ImageMagick convert tool, but you get the idea.

# pseudo-code creative commons – attribution – share alike
# author Joseph Bylund
# i.e. please copy, please give to your friends, please modify, please give me credit
use strict;
my $corrections = "pixel_corrections.png";
my $mydark = "./dark_frames/dark_frame_avg.png";
my $start = "webcam_pic.png";
my $lessdark = "lessdark.png";
my $startbydark = "startbydark.png";
my $radius = 4;
system("
convert $mydark $start -compose multiply -composite $startbydark
convert $startbydark -evaluate Multiply 2.54 $startbydark
convert $start $startbydark -compose minus -composite $lessdark
convert $lessdark -gaussian-blur 20x$radius ${lessdark}_blur.png
convert $mydark ${lessdark}_blur.png -compose multiply -composite $corrections
convert $corrections -evaluate Multiply 2.54 $corrections
convert $lessdark $corrections -compose plus -composite final.png");
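
Stripped of the ImageMagick plumbing, the blend described above is just a per-pixel weighted average. Here's a small C++ sketch of it (the pixel arrays are hypothetical stand-ins for the images above; in the script the blur is actually applied to the already-darkened image, but the idea is the same):

#include <cstddef>
#include <vector>
using namespace std;

// original, blurred (a gaussian-blurred copy of original) and dark (the
// normalized dark-frame average, values in [0,1]) are flat pixel arrays
// of equal length.
vector<double> blend_pattern_noise(const vector<double> & original,
                                   const vector<double> & blurred,
                                   const vector<double> & dark)
{
  vector<double> output(original.size());
  for(size_t i = 0; i < original.size(); ++i)
  {
    // full trust in the original where the dark frame is clean,
    // full trust in the blurred neighborhood where it is hot
    output[i] = (1.0 - dark[i]) * original[i] + dark[i] * blurred[i];
  }
  return output;
}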

Free NEF to jpg Conversion

As in free software of course, though coincidentally it doesn't cost you anything either. This conversion takes care of a few things:
1) raw -> jpg – the most basic aspect: we want to turn our nef files into jpg files.
2) noise correction as a function of ISO – I haven't played with the amount of noise correction yet; I just went with a function that starts at 100 at ISO 100 and ends up at 1000 at ISO 1600, which is the range of my camera and, according to the dcraw documentation, a normal range of noise corrections (there's a small sketch of the mapping after this list).
3) it crops some pixels – my camera has some goofy edge effects, so I just cut them off.
4) it preserves exif data – to me this is a pretty big one.
5) it's embarrassingly parallel, i.e. it processes each image individually, and it uses one fewer cpu than your machine has. For instance I have an i7 with 8 cpus (after hyperthreading), so it uses 7, leaving me one for a responsive computer while it's processing images. This significantly speeds up conversion: I get approximately a 7x speedup, so processing a few hundred photos takes on the order of minutes instead of hours (1 hour 10 minutes -> 10 minutes).
6) white balance – I'm not sold on the white balance correction in dcraw, but I think it's better than nothing. I sort of like the gimp's, but I find that it sometimes introduces color casts. Long term, I think I'll be moving towards camera white balance.
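
To make the ISO-to-threshold mapping in point 2 concrete, here's a tiny C++ sketch of the function the inner script below computes (wavelet_threshold is a name made up for this illustration):

#include <cmath>
#include <cstdio>

// 100 at ISO 100 rising to 1000 at ISO 1600: a factor of 10 spread over
// four doublings, i.e. a factor of 10^(1/4) per stop.
double wavelet_threshold(double iso)
{
  double stops = std::log(iso / 100.0) / std::log(2.0); // doublings above ISO 100
  return 100.0 * std::pow(10.0, 0.25 * stops);
}

int main()
{
  for(int iso = 100; iso <= 1600; iso *= 2)
  {
    std::printf("ISO %4d -> wavelet threshold %.0f\n", iso, wavelet_threshold(iso));
  }
  return 0;
}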

What it doesn't do (yet):
1) geotagging – I think I've got this figured out; it's detailed a few posts back.
2) auto vignette correction by lens and focal length – seems like it should be quite possible to do, I just haven't done it yet. It would involve making a directory of averaged images at each focal length and then doing some sort of ImageMagick compose divide. The biggest hassle would invariably be making the correction images.
3) chromatic aberration correction by lens and focal length – in a way easier, as one only needs to determine the multipliers for the different channels at different focal lengths and lenses, as opposed to keeping a collection of correction masks as for vignette correction. But finding the multipliers might be more difficult. I think it's going to involve taking a picture of the sky with bars (buildings perhaps) at the edges of the frame; essentially I'm looking for a scene that would create large chromatic aberrations. Then I can do a decompose in gimp and count pixels, though I'm expecting the correction to be only 1-4 pixels over nearly 4000 pixels horizontally. Anyway, eventually I may try it and see what effect it has.

And finally the scripts:

#!/usr/bin/perl
#image_converter.pl
use strict;
my $numcpus = `grep -c processor /proc/cpuinfo`;
chomp($numcpus);
$numcpus -= 1; # save one processor for doing other stuff
print "using $numcpus processors\n";
# lowercase the extensions so everything downstream only has to handle .nef
system("ls *.NEF *.nef 2>/dev/null|xargs -n 1 -P $numcpus rename 'y/A-Z/a-z/'");
system("ls *.nef|xargs -n 1 -P $numcpus chmod -x");
system("ls *.nef|xargs -n 1 -P $numcpus ~/bin/image_converter_inner.pl");
exit(0);

#!/usr/bin/perl
#image_converter_inner.pl
use strict;
my $image = $ARGV[0];
$image =~ s/\.nef$//; # strip the extension to get the basename
# pull the ISO so the wavelet denoising threshold can scale with it
my $iso = `exiftool -ISO $image.nef|awk '{print \$3}'`;
chomp($iso);
my $waveletcorrect = log($iso/100)/log(2); # number of doublings above ISO 100
$waveletcorrect = 100 * 1.778 ** $waveletcorrect; # 1.778 ~ 10**0.25: 100 at ISO 100 up to 1000 at ISO 1600
system("dcraw -t 0 -a -n $waveletcorrect $image.nef");
system("convert $image.ppm $image.jpg");
system("/bin/rm $image.ppm");
system("exiftool -q -overwrite_original -tagsFromFile $image.nef $image.jpg");
my $width = `identify -format "%[fx:w]" $image.jpg`;
my $height = `identify -format "%[fx:h]" $image.jpg`;
chomp($width);
chomp($height);
$width -= 4;  # trim the goofy edge pixels
$height -= 4;
my $cmd = "convert $image.jpg -crop ${width}x${height}+0+0 +repage $image.jpg.tmp; mv $image.jpg.tmp $image.jpg;";
system("$cmd");
exit(0);
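
To run the pipeline, save the first script as image_converter.pl and the second as ~/bin/image_converter_inner.pl (the path the first script's xargs line expects), make both executable, and run image_converter.pl from a directory of raw files; the jpgs end up alongside the nefs.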

Intel ifort flags

Update: I'd really recommend gfortran over ifort. The Intel Fortran compiler was always a pain to install, the documentation seems kind of buried, the restrictions on use are severe, and I never got much of a performance benefit. I'd be happy to do something similar for gfortran flags.

The Intel ifort compiler can generate code for automatic cpu dispatch, which optimizes the code for the processor architecture it finds itself running on. Here are some of the flags for that little bit of goodness:

-ax<codes>   generate code specialized for processors specified by <codes>
             while also generating generic IA-32 instructions. <codes>
             includes one or more of the following characters:
    K  Intel Pentium III and compatible Intel processors
    W  Intel Pentium 4 and compatible Intel processors
    N  Intel Pentium 4 and compatible Intel processors; enables new
       optimizations in addition to Intel processor-specific optimizations
    P  Intel Core processor family with Streaming SIMD Extensions 3 (SSE3)
       instruction support
    T  Intel Core2 processor family with SSSE3
    S  future Intel processors supporting SSE4 Vectorizing Compiler and
       Media Accelerator instructions

And in my opinion the critical companion flag is the one that turns off the cpu-dispatch diagnostic messages the compiler otherwise emits; that flag is: -diag-disable cpu-dispatch
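
For example, a hypothetical invocation targeting a Core2 machine while keeping the generic fallback might look like: ifort -axT -diag-disable cpu-dispatch -o myprog myprog.f90 (myprog.f90 being a stand-in for your own source).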

Intel of course keeps updating, so the most recent version of the documentation is here: http://software.intel.com/en-us/articles/intel-fortran-composer-xe-documentation/#lin