08.17.11

Is the PE Attribute Certificate Table octoword-aligned or octobyte-aligned?

Posted in Uncategorized at 7:59 pm by david

Looking at the Microsoft Portable Executable and Commmon Object File Specification rev 8.2 (Sept 21, 2010)which is the definition of the PE file formatI’m confused about the alignment of the entries in the Attribute Certificate Table.

  • Page 58, section 4.7: “The attribute certificate table is composed of a set of contiguous, octaword-aligned attribute certificate entries.” 
  • Page 59, first paragraph: “Subsequent entries are accessed by advancing that entry’s dwLength bytes, rounded up to an 8-byte multiple, …”
  • Page 59, algorithm step 2: “Round the value from step 1 up to the nearest 8-byte multiple …”
  • Page 59, algorithm step 3: “… and round up to the nearest 8-byte multiple …”
  • Page 60, last paragraph before the bullets: “If the bCertificate does not end on an octaword boundary, the attribute certificate table is padded with zeros, from the end of the bCertificate to the octaword boundary.”

So the documentation is confused.  But, it also clearly says, on page 59, “If the sum of the rounded dwLength values does not equal the Size value, then either the attribute certificate table or the Size field is corrupted.”  And on my sample, signed, executable, the Size field is a multiple of 8 but not 16, and WinVerifyTrust() says that the executable is authentic (and of course the loader will load and execute it).

So on the basis of this experimental evidence (one sample) I think we can conclude that the Attribute Certificate Table is octobyte aligned, not octoword aligned.

07.11.11

An example of documenting a broken feature as normal behavior

Posted in Uncategorized at 1:19 pm by david

Here’s an example of documenting a broken feature as normal behavior - furthermore, hiding it in the documentation as a “remark”.

The issue is the SQL Server “TOP” feature to return the “top” rows of a query result.  Consider the MSDN documentation here.  The “TOP” keyword in a SELECT statement is used to return only the first N rows of a query, for some N (or alternatively, N%).

Naturally—and I really mean, of course, there’s no other reasonable way for it to work—if the SELECT query includes an ORDER BY clause then it is respected and the sort occurs before the “top N” rows are selected.

Now for some reason they let you put the TOP keyword in an INSERT, UPDATE, or DELETE statement after the main keyword.  But in these cases, the ORDER BY clause in the subordinate SELECT is ignored!

E.g., given
INSERT TOP (2) INTO Table2 (ColumnB)
SELECT ColumnA FROM Table1
ORDER BY ColumnA;

an arbitrary two rows are inserted, not the first two rows returned after the sort! That’s REMARKable! But documented!

(There’s a workaround they show you: Put the TOP keyword after the SELECT keyword, not after the INSERT keyword. But why allow the other syntax to be at all if it leads to such obviously broken behavior?  Alternatively: Why not fix it if you’re going to continue allowing that syntax?)

03.26.11

An Unbounded Spigot Algorithm for Pi - the Java version

Posted in Uncategorized at 5:05 pm by david

Some time ago I read Jeremy Gibbons’ article “Unbounded Spigot Algorithms for the Digits of Pi” and liked it a lot- the problem was amusing, and the article was easy to underestand.

A spigot algorithm is an algorithm for producing digits of an unbounded sequence without reusing digits after they have been computed.  The digits are produced “one by one, as if from a leaky tap” (Gibbons).  An unbounded spigot algorithm doesn’t need to have some predetermined number of digits to run on—it’ll just keep going and going until you run out of memory.

I was recently experimenting with programming an FPGA and thought of this paper; my idea was that a little coprocessor (the FPGA) hanging off my server could just start producing digits of pi and keep running indefinitely.  Even though it isn’t the fastest way go generate digits of pi (not by a longshot) it still seemed like fun.  And I thought of a nice “agent” architecture where I’d take the digits and run them past an array of recognizers, each one to do some kind of statistics on the digits of pi, or look for special sequences like ten digits in a row made of one each of the ten digits.  Those recognizers would all run in hardware, in parallel.

I remembered there was a “catch” about spigot algorithms, so I reread the paper.  And the catch is that you obviously can’t generate digits of pi, an irrational (also transcendental of course), from a finite process.  So even though the algorithm can be easily specified and for sure will generate one digit at a time, the implementation might need to use arbitrary amounts of memory to produce that digit.  In fact, the memory complexity is hidden inside the bignum rational arithmetic that the algorithm uses.

Now, this wasn’t necessarily a stopper.  I thought that maybe the problem was that as digits were generated the bignum rational arithmetic was generally done on reasonably sized bignums (maybe several thousand bits of numerator or denominator), and only every once in awhile did you have a few terms which blew up to much larger sizes.  In that case you could mainly run the algorithm on the FPGA but which, when it discovered a multiplication that resulted in a too-large-for-the-hardware result, would trap back to the server which would run that particular multiplication with much more resources available, and would then pass the result back to the FPGA, which could continue.

But no, that wasn’t to be.  I experimented with an implementation in Java and discovered that the bignums get larger and larger from the very beginning, and even if you not only reduce each bignum to smallest terms, but also do the same for the 2×2 matrix that the algorithm uses, there’s no use.  You’re doing really big bignum arithmetic after the first few terms.

Anyway.  I was happy to see that the Haskell algorithms in the paper transferred very easily and very directly to Java (once I supplied a BigRational class).  Here’s the code: spigot.tar 

One curious bug kept me puzzled for hours: I got bit by a 32-bit integer overflow.  Turns out if you’re computing 3(3i+1)(3i+2)(5i-2) for i from 1 to ∞ … well, that simple expression overflows along around i = 280.  I didn’t expect it at all, and looked everywhere else, in my bignum-using expressions for it.

At this time I’d like to point out a couple of typographical errors in the programs in the Gibbons’ paper cited above, “An Unbounded Spigot Algorithm for Pi”, so you can save some time if you want to investigate this algorithm yourself.

  • pg 7, definition of extr—the / should be %
  • more crucially, pg 9, definition of next+15 should be -12, also in conjecture 1, definition of n

Object-Oriented Permutation Generation

Posted in Uncategorized at 2:39 pm by david

A classic interview coding question is—or at least used to be—to write a program to generate all the permutations of a string.

I used to like to get it because it had a simple and elegant recursive solution that was easy to right the first time, when writing it on the whiteboard. And coming up with a good recursive solution in an interview used to impress interviewers, especially if you explained it nicely in a “mathematical induction” sort of way.

Recently I had two additional thoughts about that easy solution. The first was that recursive solutions were sort of the epitome of procedural programming, but what would be the object-oriented variant, and would it be as easy to write it correctly at the whiteboard?

The second thought was a bit more interesting: suppose you actually needed a permutation generator in your program (I never have) then you probably would not want the recursive solution anyway, which would generate all permutations—and process them—before returning to the caller. Instead, you’d want a demand-oriented, that is, iterator-style, solution that you could pull solutions out of when you wanted to. That sort of solution would be more difficult to express in a procedural language, and you couldn’t easily change a recursive solution into an demand-oriented solution, but the object-oriented solution might be easier to adapt. And could you still do it at the whiteboard?

I was going to make a blog post about this—but it turned into quite the megilla and I put it on CodeProject as a tutorial article, here.

In that article I not only show the object-oriented data-pipeline version that corresponds directly to the procedural (recursive) solution, but I also show two variants of it in “pull-style” as a iterator.

However, here is the object-oriented solution to generating permutations. It is in fact as simple as the recursive solution, and perhaps even easier to get right the first time while standing at the whiteboard. I don’t know why it hasn’t been seen, as an example, earlier.

private static final class PermuteStep implements Action {
    private final Action a;
    private final char c;
    private PermuteStep(char c, Action a) {
        this.c = c;
        this.a = a;
    }
    public void next(String s) {
        for (int i = 0; i <= s.length(); i++) {
            String n = s.substring(0, i)
                     + c
                     + s.substring(i, s.length());
            a.next(n);
        }
    }
}    

public static void Permute(Action a, String p) {
     // (Argument validation elided for clarity)
    Action as = a;
    for (char c : p.toCharArray()) {
        as = new PermuteStep(c, as);
    }
    as.next("");
}

04.13.09

A couple of my articles on CodeProject

Posted in Uncategorized at 6:29 pm by david

Long time no update this blog.  I intend to fix that.  But for now, I’d like to point to two articles I posted at CodeProject:

  • Tracing Events Raised by Any C# Object—in which I describe a technique for tracing the events of any C# object using a very simple helper class, using .NET Reflection to get the event handlers of an arbitrary object.
  • Password Field Unhider (and some C++ utility classes)—first I present a small utility that lives in the Windows notification area and stands ready at any time to unhide (that is, unmask) any password field on the screen, so you can see what you’re typing. And second, I describe some very simple yet useful C++ utility classes: a general message pump, an IPC mechanism using WM_COPYDATA, and a work item dispatcher.

I intend to post more articles at CodeProject, the kind of useful tips, tutorial, explanation things, with source code, that are longer than the typical blog post.

11.11.07

Typing Mathematical Formulas in Word 2007

Posted in Uncategorized at 5:43 pm by david

You can use Word 2007 features to generate very nice looking mathematical notation. This feature is, for all practical purposes, completely undocumented by Microsoft. However, some information has been published by Microsoft employees and others on the web. This post is meant to serve as a convenient directory of that information. (This post will be updated as I learn more about equations in Word 2007.)

 
 

Interesting, but not as practical:

General places to look for information: