Accessing the Internet from a VPN connection

So you’ve got VPN working and people are able to access LAN resources remotely via VPN.

But they’re not able to access the Internet via VPN.  This isn’t usually a problem since users can always use their local internet access to access the internet.  After all, this is how they’re getting to the VPN.

There are situations where they’d rather access the internet via the VPN connection.  For instance, maybe their internet access allows VPN connections but blocks access to their favorite news site.  Or their favorite search engine.

If you’re providing VPN access using a SOHO firewall/router combo device then there is a good chance that the device will not support providing internet access to its VPN clients.  Emphasis on “will not” rather than “cannot”: the restriction is an artificial one, mainly aimed at getting you to buy a router.  It’s usually worded along the lines of “this device won’t transmit packets received on an interface back out that interface”.

Fair enough, it *is* a routing function and businesses have every right to differentiate their products as they see fit.

If you’re not interested in, or are unable to, purchase a router then you can use a proxy server to provide access to the internet for remote VPN clients.  Proxy servers are cheap and setting them up is easy.

The major downside to this approach is that clients have to configure their applications to use the proxy server.  Many networked applications have support for this (e.g., browsers) but the configuration is slightly different for each application.  And users have to remember to turn it on and off as their connection changes.  But in a pinch this will do the trick.
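As a rough illustration of that per-application configuration, here is a minimal Python sketch using the standard library’s urllib.request.  The proxy address is made up for the example (assume it’s a proxy server sitting on the LAN side of the VPN), and make_opener is just a name I picked.

```python
import urllib.request

# Hypothetical address of a proxy server on the LAN side of the VPN.
PROXY_URL = "http://192.168.1.10:3128"


def make_opener(use_proxy):
    """Build an opener that goes through the proxy, or one that bypasses all proxies."""
    if use_proxy:
        handler = urllib.request.ProxyHandler({"http": PROXY_URL, "https": PROXY_URL})
    else:
        # An empty mapping disables proxies, including any picked up from the environment.
        handler = urllib.request.ProxyHandler({})
    return urllib.request.build_opener(handler)


# While connected to the VPN:   opener = make_opener(True)
# Back on the local connection: opener = make_opener(False)
# Either way, fetch pages with opener.open(url).
```

The on/off switch is the part users tend to forget, which is exactly the downside described above.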

Is this image lined up?

A problem frequently encountered in image processing is that of determining if an image is oriented properly.  Sometimes this question is so difficult to answer that computer people, like their math people cousins, solve it by redefining the problem and solving the redefined problem instead.

In this case, instead of answering the question “Is this image properly oriented?” we answer the question “Is this image aligned to some other image?”  We’ll assume that “some other image” *is* oriented properly.  So if we can line up our image with the canonical image then we’re good to go.

There are a few techniques that can be brought to bear.  One is a frequency based technique that exploits the fact that the summed product of 2 functions, taken as one slides over the other, is maximal when they’re perfectly aligned.  This technique, convolution (strictly speaking cross-correlation, since neither function is flipped), is excellently visualized in this Wikipedia entry.

When it comes to images each image can be considered a 2 dimensional function (f(x, y) = z where x and y are the coordinates of any given pixel) defined over some finite interval.  One can visualize sliding a 2D function over another 2D function by imagining a multicolored blanket sliding over another blanket.  It can be moved in either or both of 2 directions: x and y.

The finite interval part is important.  In the case of an image the finite intervals are the dimensions of the image.  We assume it’s 0 everywhere else so that the product of the function outside of its dimensions is 0.

Since these are 2D functions, unlike the 1D functions depicted in the Wikipedia article, their product produces a surface.  The volume under this surface will be maximal when they are perfectly aligned (assuming the images are identical).
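Here is a small sketch of the sliding-product idea in plain Python.  It works directly on pixel values rather than in frequency space, treats everything outside an image as 0, and tries every shift within a small window.  The function names and the tiny test images are made up for illustration.

```python
def correlation(fixed, moving, dx, dy):
    """Sum of pixel products with `moving` shifted by (dx, dy) over `fixed`.

    Pixels of `moving` that land outside `fixed` contribute 0.
    """
    total = 0
    for y, row in enumerate(moving):
        for x, value in enumerate(row):
            fy, fx = y + dy, x + dx
            if 0 <= fy < len(fixed) and 0 <= fx < len(fixed[0]):
                total += value * fixed[fy][fx]
    return total


def best_shift(fixed, moving, max_shift):
    """Return the (dx, dy) within +/- max_shift that maximizes the summed product."""
    candidates = [
        (dx, dy)
        for dy in range(-max_shift, max_shift + 1)
        for dx in range(-max_shift, max_shift + 1)
    ]
    return max(candidates, key=lambda s: correlation(fixed, moving, s[0], s[1]))


reference = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
# The same blob moved 2 pixels right and 1 pixel down.
shifted = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]

print(best_shift(shifted, reference, 2))  # (2, 1)
```

A real implementation would do the multiplication in frequency space with an FFT; the brute-force loop is only meant to show where the maximum comes from.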

A coworker has just explained a different, spatial technique for determining image alignment (registration in the jargon of the trade).  In the spatial domain if 2 images are identical and perfectly lined up then if you were to subtract one from the other at each pixel location you would get 0s at every location.  To keep positive and negative differences from cancelling each other out the square of the difference at each point can be taken.  The sum of these squared differences will be zero when the images are perfectly aligned.

If the images are identical but not perfectly aligned you can figure out how to align them by sliding one around the other and examining the sum of squared differences (SSD).  This too can be plotted as a surface, the minimum of which represents the amount one image needs to be shifted in x and y to line up perfectly with the other.
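A brute-force sketch of that search, again in plain Python and again assuming pixels outside an image are 0.  The names (pixel, ssd, find_offset) and the 4x4 test images are mine, not from any particular library.

```python
def pixel(image, x, y):
    """Pixel value at (x, y), or 0 outside the image bounds."""
    if 0 <= y < len(image) and 0 <= x < len(image[0]):
        return image[y][x]
    return 0


def ssd(fixed, moving, dx, dy):
    """Sum of squared differences between `fixed` and `moving` shifted by (dx, dy)."""
    total = 0
    for y in range(len(fixed)):
        for x in range(len(fixed[0])):
            diff = fixed[y][x] - pixel(moving, x - dx, y - dy)
            total += diff * diff
    return total


def find_offset(fixed, moving, max_shift):
    """Return the (dx, dy) within +/- max_shift that minimizes the SSD."""
    candidates = [
        (dx, dy)
        for dy in range(-max_shift, max_shift + 1)
        for dx in range(-max_shift, max_shift + 1)
    ]
    return min(candidates, key=lambda s: ssd(fixed, moving, s[0], s[1]))


fixed_image = [
    [0, 0, 0, 0],
    [0, 5, 9, 0],
    [0, 3, 7, 0],
    [0, 0, 0, 0],
]
# Same content, sitting 1 pixel left of and 1 pixel below where it belongs.
moving_image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [5, 9, 0, 0],
    [3, 7, 0, 0],
]

print(find_offset(fixed_image, moving_image, 2))  # (1, -1)
```

Evaluating ssd at every candidate shift is what produces the surface described above; find_offset just picks its lowest point.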

There’s a lot assumption-wise that I’ve left out of the discussion.  The technique assumes that the image has regular features.  And that there are strong/sharp features (in frequency space these are represented by high frequencies) that will tend to dominate the SSD such that when the images are out of alignment the SSD will be large vs near zero when in alignment.  If the image were uniform noise then the SSD is likely to bounce around with no clear minimum as there’s no sharp edge content to anchor the sum.

Domains and Impersonation

What happens when your Windows Service tries to impersonate a local user while joined to a Domain? 

Does “.” still represent the local machine or does it represent the default domain?

To find out the answers to these questions I fired up Virtual PC 2007 and installed Windows Server 2003 R2.  Normally I’d have gone with Server 2008 but I suspect that the target environment is running 2003.

First read this Wikipedia article on Windows Domains then follow this excellent tutorial for setting up Active Directory.  Why Active Directory (AD)?  AD is basically the primary database for Windows Domains, even though it’s technically a directory service rather than a traditional RDBMS.

So, to answer the original questions, Impersonation works just fine whether or not the computer is joined to a domain.  And “.” still means local machine.  Yippeee!

Generalist’s Delight: Impersonation, UNC, NFS and Virtual Machines

Recently got a chance to exercise some of the technical muscles we generalists love to preen.

The basic problem: A Windows service that writes files to a local directory needs to be able to write files to a directory on a Unix system.  Quickly (as in we have a day or two at most to get this to work).

Caveats

  1. The Windows service runs as LocalSystem which cannot access resources over the network.  It has to (as far as I know) run as LocalSystem because it needs interactive desktop access.
  2. FTP and Samba are not available due to local policy.

But we can’t even SEE Unix directories

Since the remote host is running some flavor of Unix I expected we’d have to use NFS.  Our Windows builds don’t include NFS support. 

Fortunately Microsoft gives away a Windows add-on, Windows Services for Unix (SFU), that allows Windows to access NFS exports.  After a little setup that is.  In a non-NIS environment authentication is handled locally.  So the client system (the one mounting the NFS exported directory) needs its own copy of the user name, user id, password and group id of the account it will use to authenticate access to the Unix system.

In the Unix world that would have been the end of the story.  You’d pass the info on the command-line when you access the remote directory.  Fortunately there’s a wonderful GUI in SFU (PCNFSD) that lets you map Windows accounts to locally defined Unix user IDs.  Once mapped, NFS mounts can be accessed in the Windows-familiar UNC format (\\server\export) or NFS format (server:/export).

Impersonation

Now that our Windows machine can “see” the remote directory we’ve got to modify our service so that it can write to it.  From my web programming days I remembered that a process can impersonate another user.  In a web context this is usually done when the app server process (worker process) needs to do something on behalf of the currently connected user; something the account under which the worker process executes does not have the privs to do.

Since this is a .Net service, platform invoke is necessary to access the LogonUser() and related Win32 APIs.  Oddly enough System.Security.Principal has a class that wraps the impersonation API call but does not wrap the functionality necessary to acquire the security token required by impersonation.

But we don’t speak Unix here

We don’t have any systems running Unix but fortunately, as described in this previous post, I’ve still got a Virtual PC vhd of RedHat Fedora 11.  This will do for testing purposes.  It’s painfully slow, since I can’t seem to get Fedora to boot in Virtual PC 2007sp? with hardware virtualization support enabled (heck, I was surprised that my laptop even supports hardware virtualization).  I’m sure a VMware appliance would run faster but I didn’t have that on-hand and time was short.

After booting into Fedora I’m pleasantly surprised by all of the things they’ve copied from Windows.  I expected to have to play around with /etc/exports then bounce nfsd manually but all of this configuration can be done via UI these days.  After creating a test user, exporting a directory in the test user’s home directory and noting the test user’s user ID and group ID we’re off to plug these into our PCNFSD account mapper and start testing.

But We Can’t Read Maps

Managed/.Net processes can apparently only see drives mapped by the user under which the process is running (maybe this limitation isn’t specific to managed processes?).  So using a persistent mapped drive, which SFU doesn’t support anyway, isn’t an option.  Fortunately UNC syntax works (albeit slowly on the very first access).

Tying it all together

An ls -al provides that feeling of satisfaction as the recently created file shows up in the listing.  In pretty colors no less.  Woohoo!

D3DX – easing the path to 2D Direct3D9

Here I’m thinking I have to create my own vertex data structure and set (or mask) the FVF (flexible vertex format) flags but it turns out that D3DX, that most wonderful of utility libraries on top of Direct3D, has predefined several useful structs.

In my case, I’m only interested in drawing 2D shapes so D3DXVECTOR2 fits the bill.  Always nice to find a data structure that fits with the philosophy of using the least complicated data structure that’ll get the job done.  It exposes x and y which is mostly all that I need.

2D drawing with D3DX is not unlike drawing with GDI primitives.  In GDI you select a pen into a device context, set properties of the pen then draw onto the device context with various primitives, possibly changing properties as you go to achieve a different result (e.g., color, line width, etc.).

The D3DX analogue is a line (ID3DXLine).  This is created directly on a Direct3D device (which abstracts the underlying hardware in a way similar to the way a Device Context abstracts the underlying hardware in GDI).  Drawing is accomplished by setting various line properties then passing in an array of vertices representing the desired line segments.

ActiveX control method name changing case out-of-the-blue?

I’m humming along working on an application when suddenly Visual Studio reports that it can’t find a certain property.

The property name, let’s call it SomeProperty, is a property of an ActiveX control that gets referenced from a Windows Forms 2.0 application.

So I do a clean or three assuming that an old AxInterop dll is lying around somewhere.  After the 2nd clean I blow away a few more library directories to guarantee that the control is both being created and imported from scratch.

Still no luck.  So I clean out my temp directory since Visual Studio uses the temp directory for some intermediate files and who knows, maybe a file somehow got locked or had its permissions changed.

Same error.  I pull up OleView to verify that SomeProperty really has become someProperty.  Sure enough, it shows up as someProperty.

It turns out that this is a known issue with the IDL compiler.  It uses the case of the first instance of an identifier that occurs in the IDL file for any subsequent occurrences of that identifier even when the identifier is later used in a totally different context!

In this case, I recently added a struct with a member name that happens to collide with a property of an interface defined later in the file.  Who knew that all identifiers needed to be globally unique? 

I’ll have to give the mapping thing mentioned in the KB article a try but for now I’ll leave a comment in the IDL and rename the struct member.

Designing for the future

One of the surest ways to create designs that aren’t easily reused is to try to anticipate the domain-specific data structures that will be needed and create them in advance.  Like so much design advice this requires some caution to apply.

The domain-specific part is crucial.  General purpose data structures are very reusable.  One general purpose data structure is a list.  A domain specific data structure is a list of Employees.  How do you know when a strongly typed list of Employees is going to be reusable?  One clue is that you yourself find that you need it.  Another clue is that you keep running across code where devs create collections (or arrays) of Employees and manipulate them in some way.  If you don’t find yourself in need of a data structure and you can’t find evidence of other devs needing/using that data structure then it is probably too early to create that data structure. 

Since most of us are terrible at predicting the future the odds of getting something like a domain-specific data structure right are low.  Since we’re more likely than not to get it wrong our prematurely created data structure might get in the way of discovering the right data structure.

I find myself most able to re-use components that are highly cohesive and loosely coupled.  High cohesion means that a given method does only a single thing at the level of abstraction appropriate to its context.  Loose coupling reduces the number of dependencies between a given method or object and other methods or objects.

In practice, loose coupling is mainly about choosing what to parameterize.  For any non-trivial method it’s usually too expensive to parameterize everything.  If a method can usefully perform its highly cohesive functionality with only its input parameters then that method is loosely coupled.  These are great candidates for public entry points because they don’t make much in the way of assumptions about conditions.  Not every method needs to be extremely loosely coupled but those that are tend to be more easily reused than those that aren’t.  Of course this has to be weighed against readability: a method that takes 500 parameters might be extremely reusable but it won’t be reused much because it’s too much work to use.

For example, a method that sorts a list of strings tends to be highly reusable when working with lists of strings.  A method that takes a list of strings, tokenizes them, executes them in a shell, collects the output and summarizes the execution results will tend to be less reusable except in a very specific context.

The second method is less reusable because it does so many things.  I’m less likely to be able to compose functionality from that method because it does things I don’t want it to do.  The fewer things a function does the less likely it is to do something I don’t want and, consequently, the more likely it is to be reused.
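To make the contrast concrete, here’s a rough Python sketch of the second method split into pieces that each do one thing.  All of the names are hypothetical, and the “shell” step is approximated by running each command directly with subprocess.

```python
import shlex
import subprocess


def tokenize(command_line):
    """Split one command line into an argument list."""
    return shlex.split(command_line)


def execute(argv):
    """Run one command and return its exit code."""
    return subprocess.run(argv, capture_output=True, text=True).returncode


def summarize(exit_codes):
    """Count how many commands ran and how many of them failed."""
    failed = sum(1 for code in exit_codes if code != 0)
    return {"ran": len(exit_codes), "failed": failed}


def run_all(command_lines):
    """The do-everything method, composed from the pieces above."""
    return summarize([execute(tokenize(line)) for line in command_lines])
```

run_all is still only useful in its one specific context, but tokenize and summarize can now be picked up on their own by code that has no interest in executing anything.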