Computer Troubleshooting - specific problems

.. Geoff, Feb10/09

How to report a problem
Vendor computer support
Other Computer Resources

Related pages:
Main troubleshooting page

Apple iPhone reset
Programs don't start
Scheduled task could not start
Branch office server failure
Laser printer isn't printing
Firefox cannot see parts of Intranet
Frontier intranet stops responding
Foxpro won't start
Deere Service Advisor problems
Internet Explorer information bar pops inappropriately
Microsoft HTML help problems
Adobe Acrobat reader in Internet Explorer
Kohler Power Plus
HotmetalPro crashes
Deere ECU programming problems
Kubota K-ISS issues
"error reading information from netmap.inf" when installing Command Console
Defrag Commander reports a disk error
Print jobs get reprinted again and again
Frontier Intranet does not show costing
Symantec Anti-Virus
Kubota KIEPS parts books
Outlook Express (Internet email)
Basler DCG1000
Printing a Windows Screen
HP JetDirect print server
Inter-branch connection down
Inter-branch connection unstable

Ways to get data off a failed computer
Recovering unbootable systems
How To Troubleshoot Any Networking Problem


Specific program errors:
Apple iPhone reset Sept 29/08

If iPhone unresponsive:
- hold the Home button below the screen for at least six seconds, until the application you were using quits.
- If that doesn’t work, turn iPhone off and turn it on again. Press and hold the Sleep/Wake button on top of iPhone for a few seconds until a red slider appears, and then drag the slider. Then press and hold the Sleep/Wake button until the Apple logo appears.
- If that doesn’t work, reset iPhone. Press and hold both the Sleep/Wake button and the Home button for at least ten seconds, until the Apple logo appears.

Programs don't start Jan 13/07

Problem: Various programs don't start.

Cause: Software conflicts

Solution: Run the MSCONFIG diagnostic. Click "Services", "hide all Microsoft services", "disable all". Then restart. This starts the machine in a stripped-down mode. If the program now runs, one of the disabled services is causing the conflict. To find it, enable half the list and restart. If the program still works, the conflict is in the other half; if it fails, the conflict is in the half you just enabled. Keep halving until you have isolated the offending service.
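The halving procedure above is essentially a binary search. Here is a minimal sketch in Python, assuming exactly one service causes the conflict. The `program_works` callback is hypothetical: it stands in for the manual step of enabling a set of services in MSCONFIG, restarting, and trying the program.

```python
from typing import Callable, List, Sequence


def find_conflicting_service(
    services: Sequence[str],
    program_works: Callable[[List[str]], bool],
) -> str:
    """Return the one service that stops the program from starting.

    program_works(enabled) is a hypothetical stand-in for: enable exactly
    these services in MSCONFIG, restart, and try to start the program.
    Assumes a single culprit and that the program runs with none enabled.
    """
    candidates = list(services)
    if not candidates:
        raise ValueError("no services to test")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if program_works(half):
            # The program still runs, so the culprit is in the other half.
            candidates = candidates[len(half):]
        else:
            # The program fails, so the culprit is in the half just enabled.
            candidates = half
    return candidates[0]
```

With a list of N services this takes roughly log2(N) restarts instead of N, which is the point of enabling half the list at a time.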

Scheduled task could not start Jan 8/07

Problem: Scheduled task shows "Could not start" in status, and it doesn't run

Cause: Sloppy programming on Microsoft's part. Using DCPROMO on a domain controller messes up the user account that the scheduled task runs under.

Solution: http://support.microsoft.com/kb/246183/en-us
- delete the task
- delete files like d42*.* in "C:\Documents and Settings\All Users\Application Data\Microsoft\Crypto\RSA\S-1-5-18" folder
- recreate the task

Branch office server failure Sept 23/07

Problems: DHCP - After 24 hours, the DHCP leases will expire on all the branch computers, and all computers in the branch will lose their ability to use any network or Internet resources.
- Nobody will be able to print, as all printing is routed through the branch server
- User files will not be backed up to the local server

Solutions: Enable DHCP server on the Sonicwall. Network > DHCP Server. Enable conflict detection. Create a new dynamic range (in Calgary, from 172.16.3.200 to 172.16.3.240). Gateway is the Sonicwall's local address. Lease time is 1440 minutes (1 day). Domain name is "frontier.internal" (matches our AD directory). DNS servers are the IP addresses of DeServer, the failed branch DNS server, and DeBackup. Note that this works only because we allow 'secure and insecure updates' in DeServer's DNS. All the branch user computers appear to get IP addresses within 5 minutes of enabling the DHCP server on the Sonicwall.

- if the failure is in PT branch, which does not have a Sonicwall, you can enable the DHCP server on the 3com VPN box. However, you don't have the ability to specify DNS addresses, so this might not work as well as using the Sonicwalls.

- then, diagnose the problem. We currently use Discovery Computers in Calgary, and Byte Track in Edmonton to help us. Get Dell tech support involved to see if you can fix the problem quickly. If we cannot fix the problem that day, get a spare server out there so people can print.
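Before typing the emergency DHCP range into the Sonicwall, it can help to check that the range is large enough for the branch and sits inside the branch subnet. A rough sketch using the Calgary figures above; the /24 subnet mask is an assumption and is not stated in these notes.

```python
import ipaddress


def pool_size(first: str, last: str) -> int:
    """Number of addresses in an inclusive DHCP range."""
    return int(ipaddress.ip_address(last)) - int(ipaddress.ip_address(first)) + 1


def pool_fits(first: str, last: str, subnet: str) -> bool:
    """True if both ends of the range fall inside the given subnet."""
    network = ipaddress.ip_network(subnet)
    return (ipaddress.ip_address(first) in network
            and ipaddress.ip_address(last) in network)


# Calgary range from the notes; the /24 mask is an assumption.
print(pool_size("172.16.3.200", "172.16.3.240"))  # 41 addresses
print(pool_fits("172.16.3.200", "172.16.3.240", "172.16.3.0/24"))
print(1440 // 60, "hour lease")  # 1440 minutes expressed in hours
```

If the branch has more computers than the pool size, widen the range before enabling the DHCP server.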

Laser printer isn't printing

Problem: The laser printer isn't printing.

Cause: All print jobs are routed from the user's computer -> local server -> HP Jetdirect print server -> laser printer. Somewhere along this chain there is a problem. Sometimes a print job gets corrupted and "stuck" on the server.

If the server reboots and your workstation is not rebooted, then the next print job will probably be corrupted and cause all printing to stop.

Solution #1: First place to start is the printer. Check to make sure it is "online", and no lights are flashing, no paper is stuck. You could try turning the printer off, wait 5 seconds, then turn it back on again.

Solution #2--if there is an external print server: Turn off (or unplug) the HP Jetdirect print server for 10 seconds, then turn it back on. There is a "Print Test Page" button on it. Press it. If that doesn't work, turn it off, wait 5 seconds, then turn it back on again.

Solution #3: Check the server (EdServer, PTServer, CaServer, DeServer). Sometimes it will display a dialog box asking if you want to Retry or Delete a particular print job. You can try deleting the job. If that doesn't work, log on to it as the "Administrator" using the password "hello6". Click on Start / Settings / Printers / Laser printer. Check to see if there are any jobs that appear to be stuck. Click on the job to highlight it, then click on Documents / Cancel to delete the offending job. (Sometimes there is something in the print job itself that crashes the printer.) The rest of the print jobs should start printing. If not, choose Cancel all print jobs to clear and reset the print queue.

Solution #4: Log out of Windows, then log back in

Solution #5: Reboot your computer

Solution #6: (after you have tried solutions 1, 2, 3, 4 and 5) If you are in the branch office, try rebooting the server. Log onto the system as in Solution #3. Click on Start / Shutdown / Restart computer. Let the computer close itself down and restart itself. The process should take about 2-3 minutes. Note: NEVER do this to DeServer in Delta. Delta's main server can only be rebooted after hours.

After you reboot the server, you will have to reboot EVERY other workstation that routes jobs through this printer.

Firefox cannot see parts of Intranet

.. Jan 28/07

Problem: Firefox error message: "This address is restricted: This address uses a network port which is normally used for purposes other than Web browsing. Firefox has canceled the request for your protection."

Cause: We use ports 81-90 for Deere DVDs that assume that they are in the root directory. Firefox gets worried unnecessarily about the nonstandard port use.

Solution: type: about:config in the address bar of your Firefox browser
- Right click somewhere, and choose "New => String"
- In the setting Name box type "network.security.ports.banned.override"
- In the Setting Value box, type "81-90"
- Click OK

Problem: Clicking on a "Files" link does nothing.

Cause: Another Firefox security "feature" that gets in the way.

Solution:
1) find your Firefox 'profile' folder. It's usually c:\documents and settings\[username]\application data\Mozilla\Firefox\profiles\[random number].default

2) Open Notepad, and copy this into it:
user_pref("capability.policy.policynames", "localfilelinks");
user_pref("capability.policy.localfilelinks.sites", "http://deserver");
user_pref("capability.policy.localfilelinks.checkloaduri.enabled", "allAccess");

3) Save it as user.js in that profile folder, then restart Firefox.
The result isn't pretty, but it works.
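If several workstations need the same fix, the Notepad step can be scripted. A minimal sketch in Python that writes the three preference lines above into a profile folder; the function names are made up for illustration, and you still have to locate each user's profile folder yourself as in step 1.

```python
from pathlib import Path

# The three preferences from step 2, as name -> value.
FILE_LINK_PREFS = {
    "capability.policy.policynames": "localfilelinks",
    "capability.policy.localfilelinks.sites": "http://deserver",
    "capability.policy.localfilelinks.checkloaduri.enabled": "allAccess",
}


def render_user_js(prefs: dict) -> str:
    """Format preferences as user_pref(...) lines, one per line."""
    return "".join(
        f'user_pref("{name}", "{value}");\n' for name, value in prefs.items()
    )


def write_user_js(profile_dir, prefs: dict = FILE_LINK_PREFS) -> Path:
    """Write user.js into the given Firefox profile folder (overwrites it)."""
    target = Path(profile_dir) / "user.js"
    target.write_text(render_user_js(prefs), encoding="utf-8")
    return target
```

Note that this sketch overwrites any existing user.js in the folder, so check for one first if the profile has been customized.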

Frontier intranet stops responding

.. Oct 10/06

Problem: The intranet stops working for some reason, although DeServer is still running. Foxpro seems to operate normally.

Cause: usually a misconfiguration with WestWind web connection. Might need to be fixed with a server reboot.

Solution: Redirect users to DeBackup so the Intranet still functions normally. Then reboot and fix up DeServer, and undo the redirection.
- Internet Services Manager > Deserver > Frontier Intranet > Fpp > (right click) Properties
- Under "When connecting to this resource, the content should come from:", change from "A directory located on this computer" to "A redirection to a URL"
- Redirect to "http://debackup/fpp"
- Select "A permanent redirection for this resource"
- Click "Apply"

Undo these steps when the problem is fixed.

Foxpro won't start

.. August 29/06

Problem: error message: C:\PROGRA~1\Symantec\S32EVNT1.DLL. An installable Virtual Device Driver failed Dll initialization. Choose 'Close' to terminate the application.

Cause: Corruption in the Symantec antivirus files.

Solution: In the registry, go to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\VirtualDeviceDrivers.
- In the right pane, delete the VDD value.
- In the left pane, right-click the VirtualDeviceDrivers key, and then click New > Multi-String Value.
- Type VDD for the name of the new value.
- Exit the Registry Editor.
- Restart the computer.

Update the Symevnt files

Problem: Foxpro doesn't start. Error message is "16 bit MS-DOS Subsystem. NTVDM has encountered a System Error. The parameter is incorrect. Choose 'Close' to terminate the application"

Cause: Corrupt startup batch files?

Solution: Copy over all START*.BAT files from \\deserver or a backup
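The copy step can be sketched as a small Python helper. The source and target folders are parameters because these notes do not give the exact share path on \\deserver or the local Foxpro folder; any paths you pass in are your own assumption.

```python
import shutil
from pathlib import Path


def copy_start_batch_files(source_dir, target_dir) -> list:
    """Copy every START*.BAT file from source_dir into target_dir.

    Returns the names of the files copied, in sorted order. Both folders
    are supplied by the caller; the real share path is not in these notes.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    copied = []
    for batch_file in sorted(source.glob("START*.BAT")):
        # copy2 keeps the file timestamps, which helps when comparing
        # against the backup later.
        shutil.copy2(batch_file, target / batch_file.name)
        copied.append(batch_file.name)
    return copied
```

An empty return list means no matching batch files were found in the source folder, which usually means the wrong folder was given.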

Deere Service Advisor problems

.. Jan13/08

Problem: Install hangs with the message "Checking system status, this may take a few minutes. Please wait..."

Cause: ServiceAdvisor install is checking for IIS, and installing it. However, a July 2007 Microsoft patch messed up IIS. So it hangs forever.

Solution: Deere has customized SP2 to work around this issue. Reinstall Windows XP Service Pack 2 from the DeereServiceAdvisor Install CD:
\support\Service Packs\WinXPSP2\EN\i386\UPdate\UPdate.exe


Problem: Hot links don't work from engine diagnostics page

Cause: A July 2007 Microsoft patch messed up IIS, so it can't serve up intelligent pages

Solution: Deere has customized SP2 to work around this issue. Reinstall Windows XP Service Pack 2 from the DeereServiceAdvisor Install CD:
\support\Service Packs\WinXPSP2\EN\i386\UPdate\UPdate.exe


Problem: Deere Service Advisor tells you that you are running on a temporary license. Downloading a new license freezes.

Cause: Conflicting or corrupt Java runtimes

Solution: Remove all Java and J2SE Runtime Environments from the computer. Remove using Control Panel Add/Remove programs, and Internet Explorer > Tools > Internet Options > Settings > View objects.
Reboot. Go to www.java.com, and install the latest Java Runtime. Delete all temporary files. Delete all cookies. Then, go to customperformance.deere.com, click on "Update software license" in the lower left. Licenses are stored in C:\jdlm. Delete the directory.

If none of this works, reformat the hard disk and reinstall the operating system.

Internet Explorer information bar pops inappropriately

.. Oct 28/05

Problem: When viewing a CD or a local file, Internet Explorer helpfully pops up an information bar saying "To help protect your security, Internet Explorer has restricted this file from showing active content that could access your computer."

Cause: Silly security settings from Microsoft.

Solution: Microsoft has provided new options to turn off the security on local files to let "active content" run. Active content is anything that makes a web page dynamic--like Javascript or DHTML. Almost all modern web pages have some of this.

To run active content on all CDs without warnings, change a security setting in Internet Explorer:
* Open menu Tools+Internet Options+Advanced tab
* Scroll down to the Security section.
* Check "Allow active content from CDs to run on My Computer"

To run active content in all files on your hard disk or similar, in the same Security section:
* Make sure that "Allow active content to run in files on My Computer" is checked.

Note: With "Allow active content from CDs" selected, I have found that the Information Bar sometimes still appears saying that it has restricted active content, even though the content runs OK.

Microsoft HTML Help problems

.. Oct 3/05

Problem: When you open a .CHM help file and click on a page, you get a "page not found" error.

Cause: Silly security settings from Microsoft.

Solution: Windows is preventing the files running active content because it is coming from a network share.

Modify the ItssRestrictions registry entry to enable a specific security zone.
Warning: Enable only those security zones that you trust. Do not enable security zones about which you are not sure.

To modify the ItssRestrictions registry entry to enable a specific security zone, follow these steps:
1. Click Start, click Run, type regedit, and then click OK.
2. Locate and then click the following subkey: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\HTMLHelp\1.x\ItssRestrictions (Note: If this registry subkey does not exist, create it.)
3. Right-click the ItssRestrictions subkey, point to New, and then click DWORD Value.
4. Type MaxAllowedZone, and then press ENTER.
5. Right-click the MaxAllowedZone value, and then click Modify.
6. In the Value data box, type a number from 0 to 4 [see below], and then click OK.
7. Quit Registry Editor.
Now try to open the CHM.

Values for MaxAllowedZone:
0 My Computer
1 Local Intranet Zone
2 Trusted sites Zone
3 Internet Zone
4 Restricted Sites Zone
For most CHM files, the value of 1 should be enough to allow use without opening up access from/to remote CHM files in email/internet locations.
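When the same change is needed on several machines, the regedit steps can be captured in a .reg file and imported by double-clicking it. A minimal sketch in Python that builds the file contents for a chosen zone; the function name is made up for illustration, and the output should be reviewed before it is imported on a real machine.

```python
ITSS_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\HTMLHelp\1.x\ItssRestrictions"

# Zone numbers as listed in the table above.
ZONES = {
    0: "My Computer",
    1: "Local Intranet Zone",
    2: "Trusted sites Zone",
    3: "Internet Zone",
    4: "Restricted Sites Zone",
}


def build_reg_file(max_allowed_zone: int) -> str:
    """Return .reg file text that sets MaxAllowedZone to the given zone."""
    if max_allowed_zone not in ZONES:
        raise ValueError("MaxAllowedZone must be a number from 0 to 4")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{ITSS_KEY}]",
        # DWORD values are written as eight hexadecimal digits.
        f'"MaxAllowedZone"=dword:{max_allowed_zone:08x}',
        "",
    ]
    return "\n".join(lines)


print(build_reg_file(1))
```

Save the printed text as a file ending in .reg. A value of 1 matches the recommendation above for CHM files on the local intranet.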

Adobe Acrobat Reader in Internet Explorer

.. Apr 5/05

Problem: When you attempt to open a PDF file on the Intranet in Internet Explorer, you get a "page not found" error.

Cause: who knows? Could be an interaction between Adobe Acrobat Reader, Microsoft Antispyware, and Internet Explorer.

Solution: In Adobe Acrobat reader, Edit > Preferences > Internet. Uncheck "Display PDF in browser"

Kohler Power Plus

.. May3/05

Problem: User reports that they cannot get any illustrations when they go into the parts book.

Cause: When users log onto this site for the first time, they are prompted to install the ActiveX controls the site requires to display drawings. When the user encountered this prompt, he chose not to install any.

Solution: On the area where the drawing displays there is a shortcut that can be used to install these components. Click it to install the ActiveX control.

Kohler says: You will be prompted to install two ActiveWebParts(tm) components. You must install these components to view the parts illustrations on your computer. Click the "Install Components" at the bottom of the window to begin the installation. You only need to install these components during your initial login.

Problem: Cannot install ActiveX controls. You get the error message

Failed to install required ActiveX controls. ActiveWebParts requires you to download several ActiveX controls the first time you use the application, and whenever there is an update. The security settings for your Operating System require that you go through your local network administrator or IT department. Contact your local network administrator and this person will allow you to install the ActiveX control.

Cause: Security settings in Internet Explorer don't allow the ActiveX control to install properly.

Solution: Sign into Kohler PowerPlus. Copy this link for the ActiveX installation page:
http://www.kohlerpowerplus.com/installation.asp?mode=greeting&url=http%3A%2F%2Fwww%2Ekohlerpowerplus%2Ecom%3A80%2Fmain%5Fframe%2Easp%3Fpage%3Dproduct%5Fcatalog%2Easp
Click the popup bar at the top of the page to allow the ActiveX control to install.
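The installation link looks opaque because its second parameter is itself a URL, percent-encoded. Decoding it shows what the link contains. This is only a sketch for reading the link; how the Kohler site actually uses the `url` parameter is not documented here.

```python
from urllib.parse import parse_qs, urlsplit

LINK = (
    "http://www.kohlerpowerplus.com/installation.asp?mode=greeting&url="
    "http%3A%2F%2Fwww%2Ekohlerpowerplus%2Ecom%3A80%2Fmain%5Fframe%2Easp"
    "%3Fpage%3Dproduct%5Fcatalog%2Easp"
)


def query_parameters(link: str) -> dict:
    """Return the decoded query parameters of a URL as a plain dict."""
    query = urlsplit(link).query
    return {name: values[0] for name, values in parse_qs(query).items()}


params = query_parameters(LINK)
print(params["mode"])  # greeting
print(params["url"])   # http://www.kohlerpowerplus.com:80/main_frame.asp?page=product_catalog.asp
```

The decoded `url` value points at the parts catalog page, which suggests (an assumption) that the installer page sends you back there once the components are in place.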

HotmetalPro crashes

.. Feb 3/05

Problem: HotMetalPro crashes. This usually happens when saving a file, or shutting down HotMetalPro.

Cause: corrupted workspace. It's trying to do something but the record for how to do it is broken somewhere deep in the registry.

Solution: File > Workspace > Manage. Delete the current workspace. Quit HotmetalPro. It will automatically recreate the workspace when it restarts.

Deere ECU programming problems

.. June 14/05

Problem: cannot download a package for an engine. After the screen that says "License information required for Programming an Electronic Control Unit (ECU) is being validated. Please wait while validation takes place. If the validation is complete the next step will appear. If an error occurs, please contact your systems administrator for further assistance.", you get an error screen that says "An error occured while retrieving the license file. Please contact your System Administrator."

Cause: invalid Java runtime environment

Solution: go to www.java.com and download the latest Java Runtime.

Problem: You cannot download the package for a specific engine

Cause: misconfigured or missing directories

Solution: First, check the troubleshooting document in the Electronic Service Tools section of the Deere distributor home page. If no luck, use c:\program files\ECULP\ECULPINIConfig.exe to set the paths after creating them.

Problem: After downloading the package for an engine, you plug in the engine and nothing happens.

Cause: Engine was not ordered with configurable options.

Solution: Call DTAC and open a case. This will cost US$150 to reprogram the engine--if they determine that it is possible.

Kubota K-ISS issues

.. May 10/07

To install K-ISS, you need to:
1) Internet Explorer > Tools > Internet Options > Security > Trusted Sites > Sites...
- add both "kubota.ne.jp" and "kubota.co.jp" to trusted sites
- uncheck "require server verification (https:) for all sites in this zone"

2) Instructions are at http://www.engine.kubota.ne.jp/english/help/index.html
Your password is the same as your userID for this section.

3) log in using the userID and password they sent you. You need to change your password every 40 days, so unfortunately YOU are responsible for your password.

Setting cookies & security settings - appendix C
Troubleshooting - appendix D

"error reading information from netmap.inf" when installing Command Console

.. July 18/03

Problem: When you try to install the Recovery Console with the \i386\winnt32.exe /cmdcons command, you receive: error reading information from netmap.inf.

Cause: unknown

Solution: Find a Windows XP Professional/Windows 2000 Professional CD (whichever is appropriate) with the latest Service Pack pre-installed, and install the Recovery Console from the CD.

Defrag Commander reports a disk error

.. Geoff, July16/03

Problem: On some of my drives, the Defrag Commander log file reports that a drive is corrupted and that check disk needs to be run, but when I run check disk, no errors are found and the same error is reported in the next defrag job.

Cause #1: Bug in Windows 2000 defrag APIs utilized by Defrag Commander. For more details on this bug, please see Q320866
Cause #2: you can't run Defrag Commander on spanned volumes.

Solution: run Chkdsk /f to check for errors. If none are found, try running Windows built-in defragger. This may point you to the offending file or directory, which could then be deleted or moved.

Print jobs get printed again and again

.. Geoff, June 11/03

HP says: When printing to a shared HP LaserJet series printer in a Microsoft Windows 2000 or Windows XP environment, jobs sent from a client PC will not successfully print or will print garbage text or PJL commands. Typically, print jobs sent from the printer driver do not complete or spool properly. Print jobs sent, using an alternative printer driver, will print garbage text or PJL commands. Jobs sent from the host PC print successfully with no issues.

We have found this to be a problem when jobs go from the server to an HP print server. The spool files don't get deleted, and can be reprinted again and again.

Workaround 1: On the server, click Start, Settings, and then Printers. Right-click the HP LaserJet series printer driver icon; select Properties. Click to open the Ports tab. Uncheck the box for Enable bi-directional support. Click Apply and then click OK to exit the driver. Test print from the client PCs. If the printing is unsuccessful, try Workaround 2.

The latest software and drivers are also available for download from the HP web site.

Workaround 2: Install an alternate HP LaserJet 1200 printer driver on the host PC.
- From the Windows desktop, click Start, Settings, and then Printers.
- Click the Add a Printer icon. Click Next after the Welcome screen appears.
- Select Local Printer, and then disable the option Automatically detect and install my plug and play printer. Click Next.
- In the Ports screen, select the appropriate port for which the printer is configured. Click Next.
- Select HP as the manufacturer and the HP LaserJet 1200 series PCL for the printer. Click Next.
- Select Replace the existing driver. Click Next.
- Select Yes to set the HP LaserJet 1200 driver as the default driver. Click Next.
- Share the printer, using a name less than 12 characters in length. Click Next.
- Add location and comments if desired. Click Next.
- Click Yes to print a test page from the host PC. Click Next. Click Finish.
- If the test page prints from the host PC, try printing from the local PCs.

See also below:
Printing problems
HP JetDirect print server

Frontier Intranet does not show costing

December 10/02

Problem: Frontier Intranet programs do not show costing. The Intranet calendar program shows DeGate, Cagate, or Edgate instead of your computer name.

Cause: The Intranet programs look for the computer's IP address. If Internet Explorer is sending the request through the Proxy Server (Microsoft ISA server), what the Intranet sees is the IP address of DeGate instead of your address. As a result, it gives you a low security rating, which doesn't let you see any costing.

Solution: Internet Explorer > Tools > Internet Options > Connections > LAN Settings > Advanced > Do not use proxy server for addresses beginning with:
set it to "172.16.*" (without the quote marks)
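To see which addresses the "172.16.*" exception covers, the wildcard can be approximated with shell-style pattern matching. This is only a sketch: Python's `fnmatch` is assumed to behave closely enough to Internet Explorer's matching for simple patterns like this one, and the function name is made up for illustration.

```python
from fnmatch import fnmatch

# The exception entered in LAN Settings > Advanced.
BYPASS_PATTERNS = ["172.16.*"]


def bypasses_proxy(host: str, patterns=BYPASS_PATTERNS) -> bool:
    """True if a request to this host would skip the proxy server."""
    return any(fnmatch(host, pattern) for pattern in patterns)


print(bypasses_proxy("172.16.1.10"))   # DeServer's address: goes direct
print(bypasses_proxy("www.java.com"))  # outside address: still uses the proxy
```

Requests that go direct arrive at the Intranet with the workstation's own IP address, which is what the costing check needs.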

Symantec Anti-Virus

.. updated Jan24/08

Problem: Symantec Anti-Virus Corporate Edition - out of date virus definitions

Cause: Workstation is no longer connected to its parent server.

Solution: Have Daniel or Geoff check Symantec System Center on DeServer. See if the workstation is listed under its parent server. If not, reboot the workstation. It should appear on the list within a couple of minutes. If not, reinstall SAV Corporate Edition using Tools > SAV Client Install.

Kubota KIEPS Parts Books

Problem: Kubota KIEPS asks you to Insert a CD when you start the program.

Cause: This program needs to have drive "K" mapped to the Kubota CD. This type of problem occurs with several obsolete computer programs.

Solution: Start Foxpro. The file that starts Foxpro also creates all the required drive mappings to run Kubota's program.

Outlook Express (Internet email)

.. updated Jan24/08

Problem: Outlook Express asks you for your password.

Cause: When there is any problem connecting to your mail server, Outlook Express assumes that it is a password problem, and pops up a box asking for your password.

It is unlikely that the password you have used for months and months is suddenly invalid. And the problem is that if you put in the wrong password here, that's what Outlook Express will remember. And that means that you won't be able to get your mail. So be careful!

Solution #1: Can you connect to the Internet using Internet Explorer? If so, the problem is the mail server. Click Cancel, and check back in a few minutes (or even a few seconds) to see if it is working again.

Solution #2: Are you running Zonealarm or another personal firewall product? Try shutting it off by right clicking on the icon in the taskbar and selecting 'exit'. In a minute or so your computer should be able to connect. If this is the case, we need to reconfigure Zonealarm so it stops interfering with your work.

Solution #3: Maybe Outlook Express has actually lost your password, or a file has become corrupted somewhere. This is unlikely, but--make sure you know your email User Name and password. These are NOT the same as your usual userID and Frontier password. (The password is NOT "hello6"). Check with Geoff for your password, and write it down! Try re-inputting your password.

You might have an email address of jdoe@frontierpower.com, but that's not your real email address--it's just an alias. Your real email address might be fel17@primesignal.com. The email addresses and email aliases are in the vendor file under 'NetNati'.

Problem: Outlook Express says that "Your POP server has not responded in 60 seconds. Would you like to continue waiting?"

Cause: There are several processes that happen when Outlook Express is retrieving your mail. It does not understand what is going on--it only keeps track of how many seconds have gone by since it requested the mail. Sometimes the POP/mail server is busy, so it takes 10-30 seconds to respond. If there is mail, Norton Anti-Virus reads and analyzes it before it gives it back to Outlook Express, so this can take a few seconds.

So this message does not mean that there is a problem. It is merely an annoyance.

Solution: Click on Tools / Accounts / Properties / Advanced. Move the Server Timeouts slider to the right to at least 1 minute, or even longer.

Problem: Symantec Anti-Virus tells you that you have a virus, and Outlook Express gets stuck reading the mail.

Cause: The mechanism employed by Symantec Anti-Virus chokes on the KAK.WORM virus.

Solution: Access your account using Webmail and delete the offending item. Read about Webmail in the Remote Computing document.

Basler Troubleshooting Notes

April, 2000

Problem: Basler program hangs during splash screen.
Cause: Other programs are hogging processor time, so the Basler program is cycling around, trying to accomplish stuff.
Solution: Try exiting or quitting Foxpro. You can do this while the Basler screen is up and running.


Problem: Cannot connect by modem to remote generator set.
Cause #1: Basler software communications parameters not configured the same as the Basler regulator.
Solution #1: Do this while NOT connected. Start up the Basler software. Click on Configuration. Click on RS232. Settings should look like this:


Cause #2: Basler software does not work with WinModems. A WinModem is a particular type of modem which is software based. Some built-in modems are WinModems.
Solution #2: Check to see what type of modem you are using. Click on Start / Settings / Control Panel / Modems. If your modem says "WinModem", it is not compatible with Basler's software. Todd Stewart at Basler toddstewart@basler.com said he would look into this problem and see if he could resolve it on his April 4/00 visit to Frontier Delta. Use a different machine or a different modem. External modems seem to work OK.

Cause #3: Spartacom SAPS modem sharing software incompatible. The Basler software establishes communications, then quickly closes and reopens the comm port so it can switch communications protocols. The closing/reopening process tells the SAPS modem sharing software to terminate the session and establish a new session. This hangs up the modem.
Solution #3: No fix yet. Spartacom tech support contacted April 4/00. If they find an answer, we'll let you know here, and we will also inform Basler.

Cause #4: Telular remote unit has low signal strength. This is a possibility put forward by American Tower.
Solution #4: American Tower says they can cure this with appropriate antennas. They will take responsibility for this problem, as long as we can get the units working with landline modem connections.

Problem: Blank message box comes up after dialing, but before logging in. Clicking OK disconnects the modem.
Cause: Programming issue with Basler Bestcoms software.
Solution: The blank message box in itself is not a problem. Just wait a few more seconds until the modems finish negotiating a connection. When you click OK, you tell the Basler software to switch protocols, so this should not be done until after the modems have connected. Todd Stewart at Basler says that this is an issue he will address in a future software release.

Problem: Password box comes up even if connection failed.

Cause: Basler programming issue.
Solution: None, and none forthcoming. Todd says that this is something he cannot work around in the current program.

Problem: Password is rejected.
Cause: Password is case sensitive.
Solution: There are 2 passwords, the "customer" password ("DGC"), and the "able to mess around" password ("FRONTIER"). Both are case-sensitive. Make sure you enter them in capitals.

Printing a Windows Screen

In order to print a graphics screen, you need to capture the screen, copy it into a graphics program, then print it. Here's how to do that.

First, make sure that Microsoft Photo Editor is installed on your computer. If it is not, send an email to Geoff asking him to install it.

To capture the entire screen: press the [Ctrl] and [Print Screen] keys.

To capture just the window you are working in, press [Alt] and [Print Screen] keys.

Copy it into Microsoft Photo Editor. Start up the program, and click on Edit / Paste as New Image. Your screen image magically appears. Now you can print it, or save it, or email it.

HP Jetdirect Print Server

support telephone number: 208-323-2551
part number: J3265A (HP Jetdirect 500X print server)
Print a "test" page. Note what the "Status" and "Activity" LEDs are doing (on/off/blinking?)

One possible problem is when someone runs over the LAN cable. The JetDirect uses all 4 pairs of wires inside the cable, whereas a PC only uses 2 pairs.

Inter-branch connection unstable

.. Oct 19/04

Problem: Remote print servers go offline for a few minutes, then come back online. Connection seems to go up and down. Or Windows messenger goes down for a few minutes, and requires logging back on. This problem can continue for days.

Cause: flaky ADSL connection

Solution: unplug the ADSL router for 10 seconds. Plug it back in.

If that doesn't work, call the tech support line for the ADSL service provider. They will want you to have done this.

Inter-branch connection down

.. May 18/05

Problem: The link between Edmonton and Delta is down

Cause: The Edmonton Telus ADSL line is down; or the cleaners in Edmonton unplugged the Sonicwall TZ170 VPN box, or the Delta Sonicwall TZ170 has failed.

Solution: The steps below are written for Edmonton. Substitute the corresponding IP addresses if the problem is with Calgary.

Delta Sonicwall TZ170 - 172.16.1.1
Edmonton Sonicwall TZ170 - 172.16.2.1
Calgary Sonicwall TZ170 - not yet installed
DeServer - 172.16.1.10
EdServer - 172.16.2.10
CaGate - 172.16.3.10
Delta 3Com VPN - 172.16.1.3
Edmonton 3Com VPN - 172.16.2.3
Calgary 3Com VPN - 172.16.3.3
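The address list follows a pattern: the third number identifies the branch (1 for Delta, 2 for Edmonton, 3 for Calgary). A small Python sketch of that lookup, useful when substituting addresses for another branch. The /24 subnet boundaries are an assumption inferred from the addresses above, not something these notes state.

```python
import ipaddress

# Assumed branch subnets, inferred from the address list (not confirmed).
BRANCH_SUBNETS = {
    "Delta": ipaddress.ip_network("172.16.1.0/24"),
    "Edmonton": ipaddress.ip_network("172.16.2.0/24"),
    "Calgary": ipaddress.ip_network("172.16.3.0/24"),
}

# Devices and addresses exactly as listed in these notes.
DEVICES = {
    "Delta Sonicwall TZ170": "172.16.1.1",
    "Edmonton Sonicwall TZ170": "172.16.2.1",
    "DeServer": "172.16.1.10",
    "EdServer": "172.16.2.10",
    "CaGate": "172.16.3.10",
    "Delta 3Com VPN": "172.16.1.3",
    "Edmonton 3Com VPN": "172.16.2.3",
    "Calgary 3Com VPN": "172.16.3.3",
}


def branch_of(address: str):
    """Return the branch whose assumed subnet contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for branch, subnet in BRANCH_SUBNETS.items():
        if ip in subnet:
            return branch
    return None


def devices_in(branch: str) -> dict:
    """Return the listed devices that belong to one branch."""
    return {name: addr for name, addr in DEVICES.items()
            if branch_of(addr) == branch}
```

For example, `devices_in("Calgary")` lists the addresses to use in place of the Edmonton ones in the steps that follow.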

1) Connect to Edmonton. You can try connecting to the Edmonton 3Com VPN box (see the network map), or use the Sonicwall Global VPN client to connect to the Sonicwall.

2) Route packets from Delta to Edmonton via the 3Com VPN box.

3) Edmonton: Route packets from Edmonton to Delta via the 3Com VPN box.

Once the Sonicwall link is back up and stable, change it back. Do this ONE STEP AT A TIME, and test to ensure connectivity with each step. There should not be even 1 second of downtime.

That should do it

Ways to get data off a failed computer

.. Jan 22/07

If a computer won't start normally, here are the options we have tried (in order of difficulty) to get them up and running so we can get data off of them. This is a good reason why important data needs to be backed up regularly:

Recovering unbootable systems

.. Jan 20/04

Boot failure troubleshooting chart

Windows 2000, Windows XP and Windows 2003 Service Packs and hotfixes are usually designed to fix problems, not aggravate them. Unfortunately, some hotfix or Service Pack installations have been known to damage a system to the point where it will not boot correctly. There are a plethora of reasons for this -- for instance, a mismatch between patched and existing components or damage to the Registry for some reason other than the application of the Service Pack or hotfix.

If a system has been left unbootable due to a Service Pack or hotfix, the first course of action would be to boot to Safe Mode and remove the SP or hotfix. But sometimes even Safe Mode is not accessible, and even the "Boot from Last Known Good configuration" does not work. In cases like this, the best next step (short of a repair operation or a parallel install) is to use the Recovery Console to manually restore the files that were replaced by the Service Pack or hotfix.

Here is the procedure:

Other options include Remote Recover, which involves finding an NDIS driver for the dead machine. Use this to salvage very important data off the dead computer (which is a reason never to have data in just one place).

Other best practices involve putting the data on a separate logical disk from the system, so you can always reformat and reinstall the operating system without losing data.

How To Troubleshoot Any Networking Problem

.. Mark Minasi, Feb/06

Every day of the year, I get a bunch of e-mails from people trying to solve network problems.  And while I love to help, I'd like even more to show folks how to solve any problem on their own.  So it occurred to me that I've slowly learned that there are a bit over two dozen "rules of network troubleshooting."  I then put together a 90-minute talk on it, and I've had the chance to do that talk for audiences of up to a thousand people to good reception.  But as always, I can't get everywhere, so what follows is some of that talk.  My intention here isn't to reveal any hidden Registry entries or point you to some heretofore-secret $40,000 network diagnostic device.  No, I just want to offer what's worked for me in solving network troubles.  I'm sure some of this will be simply a reminder of what you've already learned, but I find, at least in my case, that it's all too easy to forget a rule and have to re-learn it, painfully!

Separate the C and Si Problems

I've solved a lot of network problems, but this one was a toughie.

"I've got a DHCP server that is delivering IP addresses to two segments.  The systems on the same segment as the DHCP server are getting IP addresses with no trouble, but the systems on the other segment, none of them work!" 

My first question (and probably yours, if you're a network techie) is, "does the router between the two segments pass DHCP requests?"  (In geek-ese, you may know that the other way to say this is "does the router support RFC 1542 BOOTP forwarding?")  Or alternatively, I ask, "is there a DHCP forwarder on the second segment?"

"Yes," the person replies, explaining that the router passes BOOTP packets.

Hmmm.  So what else might it be?  Check IP connectivity -- does the router block any particular port?  If it's in a network with an Active Directory and the DHCP server is on a 2000 or 2003 server, has that server been authorized in AD?  No port blocks, and yes, it's been authorized.  That's when I realize that it was a stupid question -- if DHCP weren't working, the first segment wouldn't have IP addresses.  Ah, but what if -- a eureka moment! -- somehow the DHCP server hadn't been authorized for the past six days and for some reason all of the systems on the nearby segment still had lease time left but all of the ones on the second segment had their leases run out earlier, and so were the canaries in the coal mine?  So I tell the person to try to do an IPCONFIG /RENEW on one system on each segment.  The one on the first segment succeeds, the one on the second doesn't.

Ready for the answer?  It's simple:  the guy had no idea what the heck BOOTP forwarding was, figured that his router guys must have allowed for that -- after all, they did go to a CCNA boot camp -- and just told me what I wanted to hear.  In other words, it is always possible that the carbon-based parts of the network ("C" is the symbol for the element carbon) don't report reliable information, and so the problem lay not in the silicon part of the network ("Si" is the symbol for the element silicon) but in the carbon component.  To paraphrase Shakespeare, "the fault, dear Brutus, lay not in the chips but in the people."

Don't misunderstand me, I'm not saying that everyone lies or is incompetent.  But I am saying that under stress people don't always think as clearly as they should, and that network support people have had a lot of new things thrown in their laps in the past few years -- remember when we "discovered" security in 2001, or that we all need database servers whether we want them or not in 2004? -- without receiving a concomitant increase in staffing.  We're all just human.  We make mistakes.  Think about how we make silicon-based systems more reliable:  we cluster them.  The same thing works for carbon-based units:  more eyeballs looking at a problem often make for a more quickly-solved problem. 

And -- this is important -- remember that we techies tend to think of computer problems in terms of the silicon side sometimes more than we do the carbon side.  In fact, sometimes we see the carbon side as being sort of minimal, and only relevant in a few cases.  But if you sit back and think about most of the things that you have to fix, you'll end up seeing that most of those problems have a carbon component that is at least as important as the silicon component.  I mean, Trojans don't write themselves, y'know?

Write Things Down

Now and then, I'll run into some problem that doesn't surrender itself to my charms quickly.  I circle it, nudge it, try a lot of things, and finally fix it.  Finally!  It's been hours, the whole process was mildly traumatic, and so I say to myself "well, I'll never forget that problem and solution."

But, of course, I'm wrong when I say that.  Because believe me, there's another trauma lurking tomorrow, or next week.  And that new crisis will flush out the old memory.  So I try to be methodical about writing down things that have vexed me and their solutions.  My advice:  keep a notebook, or some electronic version of a notebook.  (I have over 500 memo files in my Palm.)  You'll slowly build yourself a powerful "knowledge base" of your own.

Oh, and writing things down has another benefit:  it slows you down.  It's terribly easy to rush into a problem, running down some blind alley that seemed like a familiar problem/solution pair; taking the time to express the problem in a new format -- writing rather than just thinking -- can often wake up parts of your brain that weren't previously paying attention to the problem that you're trying to solve.

What Changed?

So the system worked yesterday, but it doesn't today... what did you (or someone else with an administrator account) do to it?

Clearly a quick look at what's changed in the past few minutes, hours or days is a good idea and certainly not one that's only occurred to me.  But quite honestly it can sometimes be difficult to figure out exactly what has changed in a software environment like today's.  Did this happen on the Wednesday after the second Tuesday in the month -- in other words, is it the day after Patch Tuesday, when Microsoft releases its monthly patches?  That's a lot of changes!

Sometimes, of course, finding out what changed is one of those carbon/silicon problems.  Recently a friend asked if I could figure out why she could no longer access her office XP desktop that she'd previously been able to get to from her home system.  I asked her if she'd done anything different.  Nope, nothing, she said.  Didn't you tell me a week or two ago that you'd had some kind of issue with a virus, I asked?  Oh, sure, she replied, she just had to update her antivirus suite.  I guessed that the suite included a personal firewall, like most modern ones and, sure enough, by "just updating" her anti-virus software -- the software, not the pattern files -- she'd installed a personal firewall.  The firewall blocked port 3389, of course, and so Remote Desktop didn't work.

Use Your References

Very few of us have the time or brain power to stay on top of all bugs, patches, upgrades and the like.  Sure, it's terribly manly to be able to quote Microsoft Knowledge Base articles by number, but, that's really just showing off, right?  That's why it's a good idea to rely on the many on-line resources that can help you in solving network problems.  At the top of the list must be

Record the Exact Error Message

Ever since the growth of the Web, this has become absolutely essential.  There are so many programs and so many error messages, some of which point to the tremendous numbers of bugs and workarounds, that an exact copy of the error message and, if possible, an actual event ID are often the key to figuring out a problem.

For example, a while back I ran into a problem with ntbackup.exe, the Backup program that comes with Windows.  I needed to recover a lost file and so popped a tape into the server's drive.  The tape whirred for a bit and then Windows 2000 asked me to insert the tape.  Hmmm... I just did insert the tape.  My right eyebrow rose involuntarily and my heart sank as I realized that I might just not be able to get that file off the tape after all.  So I searched the Knowledge Base for "ntbackup" because, again, that's the program's name.

That was my first mistake.  "Ntbackup.exe" may be the name of the file, but as far as the Knowledge Base is concerned, "Backup" is its name.  "Ntbackup" nets a small number of articles, none of which helped.

My second mistake -- like many people, I get dumber when I get frantic -- was not to write down the error message.  So, as the saying goes, I jumped on my horse and ran off in all directions trying to solve the problem until my brain calmed down.  I wrote down the exact error message and went to the Microsoft KB page.  Then -- my brain was working now -- I did not use the "Search" field on the KB page, but instead used the Google Toolbar to search the KB by typing the error message into the Toolbar and clicking "Search This Site."  The answer came up -- delete the tape library file on disk, reinsert the tape and let Backup re-catalog the tape -- and I got the file.
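The same kind of search can be built by hand. As a rough sketch in Python (the function name and the choice of support.microsoft.com as the default site are assumptions for illustration, not something from the article), this wraps the exact message in quotes and restricts the search to one site:

```python
from urllib.parse import urlencode


def kb_search_url(error_message, site="support.microsoft.com"):
    """Build a Google search URL for an exact phrase, limited to one site."""
    # Quotes ask the search engine for the exact wording;
    # "site:" limits the results to the given site (an assumed default here).
    query = '"{}" site:{}'.format(error_message.strip(), site)
    return "https://www.google.com/search?" + urlencode({"q": query})
```

Pasting the error text exactly as it appeared on screen matters more than the tooling; the function only saves retyping the quotes and the site restriction.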

Double-Check The Antivirus and Antispyware

It'd be great if anti-virus and anti-spyware tools were intelligent, intuitive, and invisible. But they're not, of course.

First of all, AV/AS can be the cause of network troubles.  Almost anyone who runs AV/AS software on their system has a tale to tell about how he couldn't install some piece of software (or, more mysteriously, a driver), only to find that it was the AV/AS system that kept it from installing.  Worse yet, some users are smart enough to have figured that out, and so they disable the AV/AS software so that they can get something done... and forget to turn it back on. 

Or it could be that, as someone once said, "an antivirus program with old pattern files is better than no antivirus at all... but not by much."  I can't tell you how many times I've been asked to look at someone's system to solve some strange behavior.  "Do you have antivirus software loaded and have you scanned recently?" I ask.  "Sure," they reply, "I scan weekly."  They just haven't noticed those annoying little pop-ups that keep telling them that their subscription has run out and that they're not getting pattern file updates any more.  No wonder they've got Zotob...

But let me go a bit off-topic and tell some truth about how I use antivirus software.  In general I don't use antivirus software, because it does get in the way of running a system.  How, then, do I protect myself from viruses?  With what I consider to be the best AV tool around:  a working brain.  Yes, I understand that you've got non-technical users and that AV/AS software helps keep them from doing dumb stuff.  But malware appears more and more quickly and exploits new bugs more quickly -- how often do you install new pattern files?  In the end analysis, most of what you need to protect your systems from viruses, worms etc is just to teach your users to be careful about what they click on when they visit Web pages, to develop an intuition about what kind of Web pages to visit and to be careful about what attachments they open in e-mail. 

Think of it this way:  suppose you got an e-mail from me that said something like "Dood... U mUst check this out!  Grate info inside!!!!" with an attachment called "invoice.exe."  Would you say to yourself "heck, it's from Mark, I'd better open it," or would you say "hmmm... either Minasi has lost the ability to use standard grammar and spelling, or this might not be on the up-and-up"?  You could then either e-mail me back and ask if I meant to send you this e-mail (which is exactly what I did when I got my first copy of the "I love you" virus ages ago), or you could take a moment, update your pattern files or just visit some malware news site and then decide whether or not to open it.

When you think about it, that's not a terribly hard skill to share with your users -- just a little "street smarts" for the Net.

If, by the way, you find yourself working on a system but do not have an AV/AS package handy, visit www.antivirus.com.  It's a very nice, free ActiveX malware scanner offered by the Trend Micro folks.  It's a nice, free public service, and it's why when people do ask me for a recommendation on an AV/AS package I point them Trend's way.  And while you probably know this already, the Microsoft Anti-Spyware Tool is a darn good piece of software, and free besides.  The www.grisoft.com site also offers an antivirus package that's free for home users.

Wait 15 Minutes -- Microsoft's Favorite Time Interval

You may have heard this from a user once or twice (or a thousand or two thousand) times:

"I just logged on, but I can't see my computer in Network Neighborhood.  What should I do?"

Of course, the real answer might be something along the lines of "well, actually Network Neighborhood is supposed to list servers, and your workstation isn't a server," but that's not going to help much.  The reason that the machine hasn't yet appeared in Network Neighborhood is because the process that keeps Network Neighborhood up to date isn't instantaneous; instead, it's built to reflect network changes within about 15 minutes.

15 minutes seems to be a lucky number in the Microsoft world.  A number of large and small things are intended to get done within 15 minutes.  For example, in addition to Network Neighborhood, Active Directory's Knowledge Consistency Checker tool -- which is embedded in every domain controller -- wakes up every 15 minutes to re-check that its DC's replication partners still exist, and the KCC also ensures when choosing those partners that no matter how many DCs exist on a site, then any change to the AD that occurs on one DC will be transmitted to every other DC in the site within ... you guessed it ... fifteen minutes.

So when you make some kind of change to your network and the change seems not to have taken effect, relax... take the 15 minutes and document what you've done so far.  Draw a picture of what you did, list the steps, and you'll often find that by the time you're done, either the change is apparent, or in the process of listing what you did, you discover that you forgot a step.

Check:  Is it Plugged in?

Check it twice...

Okay, seriously, again I mean no disrespect to users or anyone else.  It's just so easy to overlook the things that we can easily take for granted.  I mean, when someone falls to the floor, do you immediately whip out your oxygen sensing probe and check that there's a detectable amount of oxygen in the room?  I know -- the fact that you didn't also fall down kind of negates the need.  But you know what I mean; we take the mundane for granted.  And we take the reliable for granted; in billions of connections around the world transmitting gazillions of bits, things are plugged in 99.9999-plus percent of the time.

This is where checklists can be of help for two reasons:  first, to remind you to check even the unusual stuff, and, second, as an excuse for asking someone else what sounds like an insulting question.  Asking someone if everything's plugged in can make someone who's already upset more upset and angry.  So before you ask, couch it as a bit of a joke -- "Now, I've got to ask you these next few questions and they are, well, a bit silly.  But my boss makes me ask them anyway, so... would you take just a moment and crawl under your desk to make sure that everything's plugged in?  I didn't ask recently, that turned out to be the problem, and I got my butt barbequed for calling in the third-level help desk guys on the job, only to have them reach down and plug in a CAT5 cable that had fallen out of the back of the user's laptop." 

Assemble Your Toolkit

You probably fix many network problems away from your desk, at a client's machine.  Or, for that matter, skip "client" -- if you're like most of us network techie types, then you're probably not only tech support for your organization, you're also tech support for your friends, family, and neighborhood, so this might be gratis work.  But no matter how much you're getting paid or not getting paid, the fact is that I can guarantee that any time you try to fix something away from your normal workspace, then there's an extremely good chance that you'll figure out the problem...

... and then realize that the tool to fix it is back at your desk.

So do a little thinking about what you need close to hand, and assemble a toolkit.  Yours might include

Your toolkit will almost certainly contain different things; this is just a start.

Check IP Connectivity

If system mypc.acme.com doesn't talk to system yourpc.bigfirm.com, then there may be many reasons for that.  But the simplest should be the question, "do they have basic IP connectivity?"  Almost everyone reading this knows this, but let me say it anyway:  the simplest and easiest-to-find tool in this category is "ping."  You can ping either a particular IP address or a host name.  So, for example, if I'm at machine mypc and machine yourpc.bigfirm.com has an IP address of 10.50.50.70, then I can test the IP connection between mypc and yourpc by typing

ping yourpc.bigfirm.com

or

ping 10.50.50.70

And, assuming that all is well, then ping will either tell me that yourpc responded, or didn't.  Now, for those of you who've been doing this for a long time, forgive me -- that's Internet Troubleshooting 101.  But recall that either one of those pings may fail even if there's a perfectly good connection between the two systems.  Why?  Two reasons; let's consider the first.

The target system -- yourpc, in this case -- may have a firewall that keeps it from responding to ping requests.  That firewall may exist in the form of a hardware firewall or router, a piece of networking equipment between mypc and yourpc.  Or the system software sitting right on yourpc may have a software firewall of its own which blocks responses to ping requests.  Software firewalls used to be unusual, but since mid-2004, with the advent of XP's SP2, they've become quite common.  And while I like the idea of firewalls in general, I believe that blocking pings is a bad idea.

Network folks decide to block pings because criminals like to use ping to detect systems on a network.  Once they've detected a system, then they can try to attack that system.  (Let me repeat that.  The mere fact that a criminal knows that you have a system at IP address 177.44.32.19 does not mean that he now has complete control of your system.  It just means that he knows that there is a system there.  He has, as of yet, no knowledge at all of how secure or insecure your system is.  It's sort of like saying, "if I build my house underground and grow grass on the roof then I'll never be burgled."  Certainly it would reduce the probability, although it wouldn't negate it altogether.  But it would reduce the enjoyment that most people would get out of their houses.)

So network security people believe that telling a system to ignore ping requests either via a hardware or software firewall secures their system by thwarting bad guys from the start.  Ping runs atop a piece of Internet software called the Internet Control Message Protocol or ICMP.  Thus, software firewalls don't always have a check box saying "block ping;" they may instead offer the ability to block ICMP.  I've already said that I think blocking ICMP is a bad idea; here are a couple of reasons why.

In the first place, there are network programs that rely upon ping and with ping blocked, strange things happen.  For example, domain controllers in Active Directories need to respond to ping in order for group policies to work correctly; if a DC doesn't respond to pings then its client thinks that it -- the client -- has dialed up to the domain rather than is connected on the LAN, causing the client to ignore logon scripts, software installation and folder redirection.

In the second place, there are tons of other ways for bad guys to find you.  Ping just tickles ICMP, and disabling ICMP again essentially renders ping deaf.  But IP, TCP and UDP have many "ears" -- you know them as ports -- that do not need ICMP to function.  There are several tools out there that do what ping does, but not by using ICMP; instead, these tools look for activity on a particular TCP or UDP port.  (There are 64K of each of those ports, in case you didn't know.)  And most functioning systems in a corporate network can't afford to make all of their ports deaf.  Let's use for one example a nice Microsoft command-line tool called portqry; you can find it at www.microsoft.com/downloads.  It lets you essentially "ping" any port that you like.  So let's say that we want to find out if Microsoft's Web servers are active.  We could ping them, but we won't get a response; for some reason Microsoft has disabled ICMP on their Web servers.  But Web servers aren't of much use unless they communicate on port 80, so we'll just use portqry as a kind of "ping for port 80," by opening up a command line and doing the following:

E:\>portqry -n www.microsoft.com -e 80
Querying target system called:
www.microsoft.com
Attempting to resolve name to IP address...
Name resolved to 207.46.18.30
querying...
TCP port 80 (http service): LISTENING

Where ping fails, portqry does an admirable job.  So why bother telling your software firewalls to block pings?  Leave everything else blocked if you like, but save yourself troubleshooting time down the road and just tell your software firewall to allow pings.  Tracert uses ICMP also, so you may find that command doesn't work.  As I mentioned before, however, PC Pitstop has a nice Web-based tracert that works even on sites that have disabled ICMP.
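If portqry isn't on the machine in front of you, the same "ping a port" idea can be approximated in a few lines of Python. This is a minimal sketch and not a replacement for portqry: the function name tcp_port_open is made up for illustration, and it only tries to open a TCP connection, so it says nothing about UDP ports.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name (if any) and connects.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed name lookups.
        return False
```

For example, tcp_port_open("www.microsoft.com", 80) asks the same question as the portqry run above. A False result can mean the port is closed, a firewall dropped the attempt, or the name did not resolve, so treat it as a prompt for further checks and not a verdict.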

Isolate Name Resolution

I said that a ping might fail even if mypc and yourpc were both functioning for two reasons.  The first was, as you just read, firewalls.  The second is name resolution.

This is a large topic and one that I've covered in the Server 2003 book, so let me keep this brief.  Our systems have IP addresses, like 10.60.60.3 or the like, and they're perfectly happy to respond to requests to those IP addresses.  But we humans are less happy with a name like 10.60.60.3 and more happy with a name like pc31.bigfirm.com or \\MYPC.  Those names aren't for the use of the computer, they're for our use, so if the system with IP address 10.60.60.3 has a DNS name of pc32.bigfirm.com and a share named DATA, then we can map to that drive either by typing

net use * \\pc32.bigfirm.com\data

or

net use * \\10.60.60.3\data

Both have the same end effect, assuming that all's well.  But under the hood, the first NET USE needs to do an extra step that the second one doesn't.  Before the network software can contact pc32.bigfirm.com to start establishing the connection to the file server, then it's got to first stop and ask DNS, "what is the IP address associated with the system named pc32.bigfirm.com?"  For that to work, you need a functioning DNS server.  In the case of the second NET USE, you don't need a DNS server, as the question "what is the IP address of the server that you want to contact?" is already answered.

Now take that information and consider:  what if we typed that first command, but our DNS server was inoperative or our system was misconfigured so that it couldn't find any DNS server at all?  Then when it starts to do the job, your system queries DNS to find the IP address of the target file server.  DNS can't answer the question either because there isn't any DNS server, the DNS server is configured badly, or the DNS server is inoperative for some reason.  Your system never gets an answer from DNS, and so cannot go on.  Now, in the perfect world, your system would say "there may well be a file server there, but I have no way of knowing, as I can't even get started doing what you've asked because of a DNS failure.  Try the command again, but use the file server's IP address instead of its name; I may be able to connect you then."  But instead you get some short, cryptic answer.

The same thing would apply to a diagnostic command like ping; telling a system to ping 10.60.60.3 tests only the cables, NICs and routers between the system that issued the ping and the one at 10.60.60.3.  But telling a system to ping pc32.bigfirm.com requires not only the cables, NICs and routers, but the DNS infrastructure as well.  So when testing things, try to first direct your tests at IP addresses.  Then, if that test works, then try the test again, but this time call the target by its name, not its IP address.  That way, if the first test passes and the second fails, then you can be pretty sure that the problem lies with the name resolution -- the DNS or WINS servers, probably -- rather than with the IP infrastructure.
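The "IP address first, then name" order can be written down as a small routine. The following Python is a sketch under stated assumptions: the function names are invented for illustration, the probe is a plain TCP connection attempt instead of an ICMP ping, and the port you pass in (for example 445 for Windows file sharing) is your choice, not something the article specifies.

```python
import socket


def can_connect(target, port, timeout=3.0):
    """True if a TCP connection to target:port opens; target is a name or IP."""
    try:
        with socket.create_connection((target, port), timeout=timeout):
            return True
    except OSError:
        return False


def isolate(ip_address, name, port, probe=can_connect):
    """Test by IP address first, then by name, and report the likely culprit."""
    if not probe(ip_address, port):
        return "IP connectivity problem (or the port is blocked or closed)"
    if not probe(name, port):
        return "name resolution problem (check DNS or WINS)"
    return "both tests pass"
```

Called as isolate("10.60.60.3", "pc32.bigfirm.com", 445), it only blames name resolution once the address test has already succeeded, which is the point of running the tests in that order.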

Know WINS Versus DNS

Speaking of name resolution, understand that every system in the Windows world has two names -- its WINS name and its DNS name.  WINS is part of NetBIOS, an old way of naming systems that is almost exclusively used in the Microsoft world.  It's supposedly obsolete but it is still so embedded into Windows systems that, well, Microsoft's been trying to root it out of Windows for six years and has a ways to go before they succeed.  DNS, in contrast, is an Internet standard that is used both in the Windows and non-Windows world and that Microsoft embraced in 2000 with the advent of Active Directory.

Okay, you ask, what does that all mean?

Well, again, nearly every Windows computer has a WINS/NetBIOS name and a DNS name.  Some programs need to hear a DNS name, other programs need a WINS/NetBIOS name.  Some programs will take either.  Sometimes it's easier to think of this not as names but as "identifiers."  For example, I've got a phone number, e-mail address, and a street address.  They're all "identifiers" for me, they're all "names."  To call me, you'd need the identifier that is the phone number -- knowing my street address will not help you call my phone or send me an e-mail.

Similarly, not only can networks run into general name resolution problems, they can also suffer from "wrong name resolution" problems.  So if you haven't already, read up on WINS and DNS.  This is important, particularly in the Active Directory world.  DNS causes at least half of the problems that look like AD problems.

Check the Logs

Modern software writes logs.  Lots of 'em.  Event log entries.  Logs of their own.  For example, did you ever have DCPROMO fail, only to offer an explanation about as long as a fortune cookie, but less helpful?  I'll betcha didn't know that in \Windows\Debug you'll find a file called dcpromo.log.  It's actually helpful sometimes.  Group policy has its own error log named userenv.log that you can enable with a Registry entry.  DHCP, DNS, and WINS will log the heck out of themselves.

But many of us overlook logs.  Why do we do this?  I'm not sure, but I think it's that Windows trains us not to look.  When it thinks something's important, then it sticks an annoying dialog box in your face, even if the subject of the dialog box is not important, so we kind of assume that Windows will do something to get our attention any time something bad happens.  As a result, we're stunned when we actually get a moment to look at an Event Log and see a sea of red crosses and some really scary events.

Check those logs.  There's often quite a wealth of useful stuff in there.  And make peeking at more than one system's Event Log easier by going to Microsoft's site to download a free tool called eventcombmt.  It'll grab logs from multiple systems, filter them as you like and produce summary reports.  It's a bit rough-edged but it works well and, of course, the price is right!
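If a log has already been saved out as a CSV file, even a very small script can show which component is complaining the most. The Python below is a sketch only: it assumes the export has a header row with "Type" and "Source" columns, which are hypothetical names that may not match your export, and the function names are made up for illustration.

```python
import csv
from collections import Counter


def count_errors_by_source(rows, type_field="Type", source_field="Source"):
    """Count entries whose type is "Error", grouped by source."""
    counts = Counter()
    for row in rows:
        # Column names are assumptions; pass your own if the export differs.
        if (row.get(type_field) or "").strip().lower() == "error":
            counts[row.get(source_field) or "unknown"] += 1
    return counts


def count_errors_in_file(path):
    """Read an exported CSV log (header row required) and count its errors."""
    with open(path, newline="") as handle:
        return count_errors_by_source(csv.DictReader(handle))
```

The result is a Counter, so its most_common() method lists the noisiest sources first.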

Simplify the Problem

I wish I had a dime for every time I get an e-mail that starts off, "I'm trying to make [fill in the blank software] run.  The client and server just don't talk."  I'm surprised because this often refers to software that I use a lot.  I ask a few questions, get a few answers, am still confused, and then the light turns on.

"Are these two systems on the same segment?"

Well, actually, no, I usually then hear, and with a bit more prodding I discover that there are two firewalls, a public cable Internet connection and a NAT router between them.  I take ten deep breaths, then suggest that they test it on a single segment with no other hardware between them.  Then, if it doesn't work, then it's probably a serious problem with whatever software they're using.  If, on the other hand, it starts working, then it's time to re-insert those devices one at a time until things stop working.  Then it's clear what needs a bit of configuring.

Look for the unusual.  Is your test machine, or the one that you're trying to install to, unusual in some way?  Is it multihomed?  Is it a virtual rather than a physical machine?  Does it run some kind of software firewall?  While we're at it, consider doing the test on an isolated segment with the anti-virus and anti-spyware software turned off.  Again, if everything works in that situation, then turn things back on until the problem recurs.  (And if it is multi-homed, then play around with the binding order of protocols on the NICs.  That can solve a surprising number of things.)

Simplify the Network

While we're at the simplification stuff...

Any network that's been running for more than a few years contains a lot of software and a lot of hardware... but the longer a network runs, the more mainly useless hardware and software it accumulates.  Maybe it's time to finally turn off that Banyan VINES server.  And do we really use that dedicated Netgear print server doodad?  The last Windows 98 box is gone... it's probably okay to kill NetBEUI.  We haven't used that VPN in two years; let's bite the bullet and take its clunky client software off the laptops.

Fewer moving parts means cleaner operation and fewer things to break.  And while you're cleaning house, it's time to ...

Know Your Network

It's a Sunday in January, 2003.  A worm of some kind is loose on your network and it is saturating your LAN's bandwidth.  You put a network monitor on your network, and discover that the bad device is at IP address 10.4.198.33.  You and your co-worker exchange a look of triumph.  But, a half-second later, those looks fade as you each ask yourselves the same question:  which computer is that?

A network diagram really helps in this case.  Even the smallest firms will find some network documentation handy at one time or another.  Physical locations of systems, IP addresses, what software runs on them, what protocols run on them.  Locations of WAPs, hubs, routers, switches, as well as a pointer to wherever their drivers, configuration utilities or the like reside.

I know, this seems simple.  But there are two things to remember about this, and they're also simple -- but people forget them.  First, do this network documentation before a problem occurs.  And, second, be sure to have a copy or two that doesn't live on the network!
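Network documentation does not have to be elaborate to answer "which computer is that?" As a minimal sketch in Python, the table below holds a few sample entries taken from the address list earlier on this page (the site for each is inferred from its subnet); the names INVENTORY and who_is are hypothetical, and a real inventory would cover every device.

```python
# Sample entries only, copied from the address list earlier on this page.
# The "site" values are inferred from the subnets, not stated in the list.
INVENTORY = {
    "172.16.1.1": {"name": "Delta Sonicwall TZ170", "site": "Delta"},
    "172.16.1.10": {"name": "DeServer", "site": "Delta"},
    "172.16.2.10": {"name": "EdServer", "site": "Edmonton"},
}


def who_is(ip_address, inventory=INVENTORY):
    """Describe the device at an IP address, or say that it is undocumented."""
    entry = inventory.get(ip_address)
    if entry is None:
        return "unknown device - not in the inventory"
    return "{} ({})".format(entry["name"], entry["site"])
```

An "unknown device" answer is useful too: it means either the documentation is out of date or something is on the network that should not be. As the text says, keep a copy somewhere that does not depend on the network being up.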

Isolate the Bad Component

Related to "simplify," I mean here to use clues to help zero in on the troublesome piece.  Does turning something off make the problem go away?  Does only one client have the problem?  Then focus on the client.  Do all clients have the problem?  Then focus on the server.  Does only one floor have the problem?  Check that floor's hubs, switches and routers.  Can you attach a new client and get it to work?  Then perhaps some hotfix or other upgrade got in the way.
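
The questions above amount to a small decision procedure.  The sketch below is one hypothetical way to write it down; the function name, the client names and the floor mapping are all assumptions for illustration, and real networks will need more cases than these.

```python
def where_to_look(failing, all_clients, floor_of):
    """Suggest where to focus, given which clients show the problem.

    failing     -- set of client names that have the problem
    all_clients -- set of every client name on the network
    floor_of    -- dict mapping client name to its floor
    """
    failing = set(failing)
    all_clients = set(all_clients)
    if not failing:
        return "nothing is failing"
    if failing == all_clients:
        return "all clients fail: focus on the server"
    if len(failing) == 1:
        return f"only {next(iter(failing))} fails: focus on that client"
    failing_floors = {floor_of[c] for c in failing}
    if len(failing_floors) == 1:
        floor = next(iter(failing_floors))
        on_floor = {c for c in all_clients if floor_of[c] == floor}
        if failing == on_floor:
            return (f"only floor {floor} fails: check that floor's "
                    "hubs, switches and routers")
    return "no clear pattern: keep narrowing it down"


# Made-up example network: four clients on two floors.
clients = {"pc-a", "pc-b", "pc-c", "pc-d"}
floors = {"pc-a": 1, "pc-b": 1, "pc-c": 2, "pc-d": 2}
print(where_to_look({"pc-a", "pc-b"}, clients, floors))
```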

And speaking of routers, hubs and switches...

Hardware Breaks... Even Reliable Hardware

In the 21st Century, we're sort of used to software being the source of most of our problems.  That's because in the thirty-odd years since microcomputers appeared, we've seen hardware get physically smaller.  It's gotten simpler in the sense that what once required a big gray box with a twelve-inch-square motherboard and eight add-in cards now appears on a three-by-four inch motherboard with no add-in cards, and that motherboard contains a small fraction of the number of discrete chips of the older one.  It doesn't draw as much power and is therefore cooler -- and therefore longer-lived.  We have also seen the slow disappearance of moving parts, that bête noire of reliability; when was the last time you had to align a floppy drive head?  It's true also that the increasingly low cost of chips allows hardware vendors to use dedicated computing devices to make unreliable physical devices like hard disk heads and platters more reliable through automatic error detecting and correcting systems.

I suspect that anyone keeping a log of computer problems would find that 95-plus percent of their problems were software-related rather than hardware-related.  Take out the inevitable problems you see in new hardware, forswear overclocking and home-brew combinations of random CPUs and motherboards and that 95 percent rises further.  Yup, nowadays, hardware is pretty good.

Which is, unfortunately, bad.  It predisposes us not to consider hardware as a source of network mysteries.  Here are a few examples.

But how to quickly diagnose this kind of stuff?  That leads us to...

Have Spare Parts On Hand

All too often, hardware doesn't completely die, it just gets sick.  So it needs testing.  Now, you may know that in many industries you can purchase really nice test equipment from companies with names like Fluke and Agilent.  (Really expensive equipment too, by the way, but worth it in saved time.)  But there isn't, nor will there be any time soon, a big market for PC testing equipment.  Sure, you can buy tools to test Ethernet or IEEE 802.something networking.  But motherboard testers for your laptop or your RAID card aren't in the offing any time soon, mainly because the PC and PC networking markets change so rapidly that by the time a piece of test equipment appeared for a given PC component, that component would be irrelevant.  I should, however, parenthetically note that this lack of PC and PC networking test equipment will probably change now that the pace of change in PC hardware has slackened.  But that equipment might still end up priced out of the hands of many.

What, then, is often the least expensive piece of test equipment?  A spare part.  When I'd order a ton of some kind of equipment for a client, I'd often advise a client to buy an extra one.  If buying, say, 100 desktop computers, then it'd be nice to have a 101st on hand so that as we take the PCs out of the boxes and try them out, then we can quickly verify what ails the occasional troubled new PC by swapping parts from that extra PC.  As veteran PC troubleshooters often say, "swap 'til you drop."

Additionally, you may want to consider spares for anything that is a choke point in your network.  I used to have an Internet connection via a frame relay to my ISP.  (DSL doesn't go where I live.  Nor does cell reception beyond one bar.  Thank heavens for cable modem.)  The ISP was great (Continental VisiNet, www.visi.net) but I had to leave them because the frame relay was run by Verizon, who for some reason could not keep a simple 256 Kbps connection up for more than about 80 percent of the time.  (The Verizon guys once told me that they had no idea why they'd bothered taking the job, as anything below a T1 was apparently beneath them.  I think that in my next life, I'm going to skip this silly small service business stuff and get me a monopoly.  Definitely.)

Anyway, I was connected to the world via a Cisco 1602 router and if lightning struck anywhere in the surrounding five towns, the Cisco would die.  (Yes, I installed every kind of lightning protection I could find.)  The best answer was to have another 1602 around, already configured to be swapped out while I sent the fried one out for repair.

Reboot!

I have made this comment with tongue-in-cheek for years, but it does bear some truth:  "the two most effective tools in the Microsoft world are 'reboot' and 'reinstall.'"  (I should mention, however, that XP's System Restore has drastically reduced the number of reinstalls that I've had to do to that product, and I can't wait to see System Restore come to Server in Longhorn.)

I remind you about rebooting because where once we just knew that anything more minor than changing the background color required a reboot, modern Windows can do an awful lot of things without needing a reboot.  Consider that you can take a vanilla copy of Server and add DHCP, WINS, DNS, IIS, and the majority of patches delivered for XP and 2003 in the past year... all without a reboot.

But many things do require a reboot, and if you've made a change to your system software and it hasn't quite "taken" yet, then give it a reboot, if you can.  ("If you can" because I know that some of you have annual bonuses tied to maintaining that "five nines" thing, and if I did the math right then 99.999 percent uptime means no more than about five minutes' downtime all year.)  See if that reboot helps things start working.
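
For anyone who wants to check that math, here is the arithmetic as a short sketch.  It assumes a 365-day year; the function name is mine.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def allowed_downtime_minutes(uptime_percent):
    """Minutes of downtime per year that a given uptime percentage permits."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


for pct in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(pct)
    print(f"{pct}% uptime allows about {minutes:.1f} minutes of downtime a year")
```

Five nines works out to roughly 5.3 minutes a year, which is less time than many servers take to reboot once.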

Group policies can require a reboot or two.  XP, in fact, has its own strange way of processing GPs such that some settings can take up to three reboots to take effect.  Group Policy Management Console, a free download from Microsoft and an optional component of Server R2, has a very nice set of reports that can help figure out why a GP setting hasn't taken effect.  (But don't try to run GPMC on an x64 system; .NET problems keep any of the Windows x64 builds from running GPMC.  Bummer.)

Hardware often needs "rebooting" after being reconfigured.  Routers, modems and the like won't always show the effects of reconfiguration until you actually power them off and on.

And speaking of reboots versus powering off and on, remember that if you are rebooting a system because you think that you've cleaned some kind of virus, spyware or whatever off that system, then always shut the system down altogether, and then turn it back on, a so-called "cold boot."  It's possible to create a piece of software that can survive a warm boot, so a virus that you've cleaned off the hard disk might be lurking in RAM hoping for a warm boot -- and another chance to infect your hard disk.

Know What's Normal

Until I got rid of Verizon's unreliable frame relay, something called a Frame Relay Access Device (FRAD) sat in my office.  It was a beige box that contained six LEDs.  Each LED could be green, amber, red, or off.

The first time the frame behaved strangely, I glanced at the FRAD to see if the lights told me anything.  That's when I noticed that most of them weren't labeled, which was, well, disturbing.  Then I noticed that two of them were off -- just dead.  Two were green and two amber.  Of course, the first thing I thought was, "guess I'll have to call Verizon."

As it turned out, two greens and two ambers was bad, three greens and one amber was good.  The bottom two never ever lit up in the five years that I used the FRAD.  But it made me do what I should have done before:  make notes on "normal."  And here's a cheap way to keep track of what's "normal" on a datacomm box's LEDs... just get out your digital camera and take a picture.
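
If a photo isn't handy, the same idea works as written notes:  record the healthy state once, then compare against it when something seems off.  This is a minimal sketch; the LED names and which positions are green or amber are assumptions, since the FRAD's lights weren't labeled.

```python
# Hypothetical baseline, noted while the link was healthy:
# three greens, one amber, bottom two never lit.
NORMAL = {
    "led1": "green",
    "led2": "green",
    "led3": "green",
    "led4": "amber",
    "led5": "off",
    "led6": "off",
}


def differences(observed, normal=NORMAL):
    """Return {led: (expected, seen)} for every LED that isn't at baseline."""
    return {
        name: (expected, observed.get(name))
        for name, expected in normal.items()
        if observed.get(name) != expected
    }


# The "two greens and two ambers" state from the story, positions assumed.
today = dict(NORMAL, led3="amber")
print(differences(today))
```

An empty result means the box looks the way it did on a good day, so the problem is probably somewhere else.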

Make One Change At A Time

This is sooo obvious, and soooo hard...

Let's say that you have to install something in your Web server -- perhaps a FireWire card.  But you've had some extra RAM lying around, so why not beef up the memory while you're at it?  And as long as the server's case is open, it really is the time to get some compressed air and blow out the dust bunnies, right?  And just maybe, since the cover's open ... NO!!!!

Let's be realistic about this:  machines are out to get us techies.  You know it, and I know it.  But they have to play by certain rules, and one of those rules is, "if the human changes something in me [the machine], then I'm really only allowed to break if I can create an excuse that's relevant to what the human changed."  In other words, if all you do is add the FireWire card and the system refuses to boot, then the chances are really good that simply removing the FireWire card will un-do the damage, unless you also dropped something into the case while installing the FireWire card.  In contrast, doing two, three, or four things all at once, then putting the case back on, gives the machine all kinds of plausible reasons to fail, and basically puts you in a position of having to pretty much reduce the system to parts and re-build it from there.  Ugh.

Now, I'm not saying not to add that RAM or blow out those dust bunnies... just that you'll be happiest if you do each of those jobs one at a time, then power up the system to ensure that the machine's not misbehaving, then power it down and make the next change.

Never Assume

The famous Felix Unger exclamation from an episode of The Odd Couple aside, assuming can get you in a lot of trouble.  When a big part of my network stopped working, recall that I never even imagined that the power brick on the Netgear hub would die, so I didn't think to look at it for precious minutes while I looked at what seemed certain to be the cause.  (I've forgotten what did seem certain at the time.  Now I suppose I'd assume that it could be the power brick, and so overlook a bad cable.  Well, I hope I wouldn't do that any more.)

Adopt a no-assumptions approach to troubleshooting.  Sure, you need to create an order in which to test things -- for example, you might decide that software breaks more often than hardware, and so you check the software first -- but never leave a component off the "things to check" list just because it's reliable.  It might just be reliable.  Really reliable.  Just not 100 percent reliable.

Get And Learn To Use a Network Monitor

Networks can seem sometimes like nothing more than piles of black boxes whose only outward signals are green, blue, red and amber LEDs.  (Minasi's first law of data communication devices is, "the more little lights on the data comm thingies, the better.")  But sometimes it'd be nice to just open up that cable and actually see the bits go by to understand what's going on.  That's where a network monitor, sniffer, whatever you want to call it, is valuable.  Programs of this type capture and analyze data transmitted across your network.  With them, you can actually see how, for example, your system gets its IP address from DHCP.

There's a simple piece of network monitoring software that ships with every copy of Server, but it only shows traffic traveling to and from the server.  Other network monitor software uses what's called a "promiscuous" network capture driver, which means that you can capture not only your system's traffic, but any other traffic on the network.  (Actually, that only works if you've got Ethernet hubs rather than switches.  You can make switches "promiscuous," but it's not normal behavior.)  There's a free network sniffer called Ethereal from www.ethereal.com.  The Unix world has always had a command-line network sniffer called "tcpdump" and now we Windows types have one as well called WinDump that you can find at www.winpcap.org/windump.  The output is harder to read than the nicely-formatted stuff that a GUI sniffer produces but it's something of a standard and, besides, it's much easier to show tcpdump/windump output on a printed page than it is screen captures of some GUI sniffer.
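
To give a feel for what a sniffer's decoder does with the bytes it captures, here is a rough sketch that pulls the DHCP message type out of a raw DHCP payload (the data carried inside the UDP packet).  The layout follows the standard DHCP format:  a 236-byte fixed header, a four-byte "magic cookie," then options, with option 53 holding the message type.  The function names are mine, and the demo payloads are synthetic rather than captured from a real network.

```python
MAGIC_COOKIE = bytes([0x63, 0x82, 0x53, 0x63])
FIXED_HEADER_LEN = 236  # BOOTP-style fixed fields that precede the options

MESSAGE_TYPES = {
    1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 4: "DECLINE",
    5: "ACK", 6: "NAK", 7: "RELEASE", 8: "INFORM",
}


def dhcp_message_type(payload):
    """Return the DHCP message type name in `payload`, or None if absent."""
    options_start = FIXED_HEADER_LEN + len(MAGIC_COOKIE)
    if payload[FIXED_HEADER_LEN:options_start] != MAGIC_COOKIE:
        return None  # not a DHCP payload
    i = options_start
    while i < len(payload):
        code = payload[i]
        if code == 255:  # End option
            break
        if code == 0:    # Pad option is a single byte with no length field
            i += 1
            continue
        if i + 1 >= len(payload):
            break        # truncated option
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code == 53 and len(value) == 1:
            return MESSAGE_TYPES.get(value[0], f"unknown ({value[0]})")
        i += 2 + length
    return None


def fake_payload(message_type):
    """Build a mostly-zero DHCP payload for demonstration only."""
    return bytes(FIXED_HEADER_LEN) + MAGIC_COOKIE + bytes([53, 1, message_type, 255])


# The usual four-step lease exchange, in order.
for t in (1, 2, 3, 5):
    print(dhcp_message_type(fake_payload(t)))
```

A real sniffer does this for hundreds of protocols and shows every field, which is exactly why it's worth learning one instead of writing your own.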

Keep An External IP Address

When trying to fix many kinds of server problems, the ultimate connectivity test is often "can I get out to the Internet from inside our intranet?" or "can someone on the Internet get to our public servers?"  The only way to check the second of those two is with an IP address not connected to your network at all, an address that does not directly appear in any of your routing tables.  Here's a simple way to maintain such a thing:  get a data service for your cell phone and the USB cable that lets you connect the cell to your PC.  Then you can always get yourself an external IP address and ping away.  External e-mail addresses are helpful as well; I use a Hotmail address to test sending e-mail to my internal e-mail account whenever testing some change to my internal e-mail server.
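
Ping only tells you that a machine answers; it doesn't tell you whether the Web or mail service on it does.  From that external connection, a quick TCP connect test fills the gap.  This is a minimal sketch using Python's standard socket module; the host name in the comment is a placeholder for your own public server.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, name not found, unreachable...
        return False


# Example, run from the external connection (placeholder host name):
#   can_connect("www.example.com", 80)   # public Web server
#   can_connect("www.example.com", 25)   # public mail server
```

A False here from outside, combined with a True from inside, points at the firewall, the router or the provider rather than the server itself.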

Double-Check Security and Permissions

Can't do something that you think you ought to do?  Then ask:  have you done something to "harden" the security of your system recently?

After Code Red, I took some time and hardened my IIS server, and I mean really hardened it.  About a year later, I wanted to learn how to use the Index Service so as to create a search engine for my newsletters.  But after two weeks' work, I still couldn't get the Index Service to do a blasted thing.  Finally, it occurred to me to build a fresh, out-of-the-box test system... and everything worked fine.  You see, in the process of "hardening" my Web site, I'd inadvertently removed the System account's permission to read my Web site.  The Index Service needed to index content inside the Web site, and, as the Index Service runs as System, it was denied the ability to read the files.

Sometimes you can detect this by auditing "processes" in the Security event log.  That may give you a clue about whether or not you're being defeated by your security measures!

Call the Outside Communications Service (telephone company, cable company, ISP) Last

They don't care if you have a problem; they get to charge you monthly whether you use those bits or not.  They assume that you're a moron and are completely prepared to lie to you just to get you off the phone.  (Many are actually timed as to how long they're on the phone and are rated higher if they can process more calls per hour rather than, say, customer satisfaction.  In other words, their bosses pay them to get you off the phone as soon as possible.)  What's that, you don't believe they'd lie?  Okay, how about a couple of real-life examples.

When getting DSL installed, the installer was late and when I talked to the dispatcher, she told me that he wasn't bringing a DSL modem even though I'd already been invoiced for one.  I was irritated.  "Don't worry," she told me, "you can use your current modem."  "My current DIAL UP modem," I asked.  "Yes," she said, continuing "you won't get as good a speed as you will with OUR DSL modem, but it'll still be many times faster than you're getting now."  Or how about the Charter Cable guy who told me that the 5 kbps throughput that I was seeing on my cable modem wasn't Charter's fault, it was the "phone company."  I was puzzled and asked what they had to do with it.  He explained to me that "the phone company runs the Internet."  Nope, I am not embellishing.  That's a direct quote, and the gentleman who said it did so in November 2005.

So before you call your provider, get your facts straight.  Can't ping out?  Then do some tracert commands to find out exactly where things fall down.  At an ISP that I used to work with, the tracert often failed at an IP address that I knew was in their router farm.  So after they'd suggested by their tone that I was an idiot and they were just waiting to show me how stupid I was, I'd say "hey, I ran a traceroute and it stops at IP address such-and-such.  Is that one of yours?"  And oddly enough, I got connected to a third-level immediately!
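
If you save the tracert output to a file, a few lines of script will pick out the last hop that answered, which is the address to quote to the provider.  This sketch assumes output laid out like the Windows tracert command's (hop number first, address last); the sample text and its addresses are illustrative, not from a real trace.

```python
import re

HOP_LINE = re.compile(r"^\s*(\d+)\s")               # lines that start with a hop number
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")   # a dotted-quad address


def last_responding_hop(tracert_output):
    """Return (hop_number, ip) for the last hop that replied, or None."""
    last = None
    for line in tracert_output.splitlines():
        hop = HOP_LINE.match(line)
        if not hop:
            continue  # header or blank line
        ip = IPV4.search(line)
        if ip:        # timed-out hops carry no address
            last = (int(hop.group(1)), ip.group(0))
    return last


# Illustrative sample only; the addresses are documentation placeholders.
SAMPLE = """\
Tracing route to www.example.com [192.0.2.80]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  192.168.1.1
  2     9 ms     8 ms     9 ms  198.51.100.1
  3    12 ms    11 ms    12 ms  203.0.113.5
  4     *        *        *     Request timed out.
  5     *        *        *     Request timed out.
"""

print(last_responding_hop(SAMPLE))
```

Keep in mind that some routers simply don't answer traceroute probes, so a timed-out hop in the middle of an otherwise complete trace doesn't prove anything by itself.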

And before you do contact your provider, be sure to do what they'll tell you to do anyway -- pull the plug on the router, frame relay, cable modem, DSU/CSU or whatever, count ten and plug it back in.  (I find it fascinating that I pay for a business cable account and when I call to report that there's something wrong with my connection, they say "unplug the cable modem and plug it back in."  I say "why? It's connected to a UPS -- yes, that's a UPS, not an SPS -- and that's backed up with a generator.  The only reason that it'd need to be cycled would be if there were a serious problem in its design or firmware and if that's the case, then why are you using that brand of cable modem instead of ..." but then I realize that some arguments aren't won with logic alone, nor do I own any firearms, so I unplug the cable modem, count ten and plug it back in.  Oddly enough, that never fixes things.)  Being able to say "I've already cycled the power, disconnected my cable from the router and directly connected it to my laptop" and so on makes for faster service and doesn't let them put you on "ignore."  (Oops, I meant "hold.")

Walk Around the Block, or Explain the Problem to Someone

This is a great tip.  Honest.

We humans have got really good brains.  Sometimes, though, we just don't know how to use them.  Ever been faced with a problem that stumped you for an hour or two and, once you figure out the solution or are presented with the solution, you say "aw, heck, I knew that!"  Of course you did; the answer was there, you just didn't have a path to it -- I think of it as "now, if this neuron and that neuron in my brain were to get together and have lunch now and they..."  And besides, how many of you get to troubleshoot in a calm, relaxed, supportive environment?  Stress makes us ready to run from a predator or kill something that's trying to kill us first; it's not so good at making you able to troubleshoot group policy conflicts.

So remove the stress and massage your brain a bit.  I find that re-framing the problem in some way causes different parts of my brain to get involved.  For example, sitting down and writing the problem out on paper may cause your brain to find the answer when the problem seemed insoluble moments ago.  Explaining the problem to someone else causes all of those verbal neurons to wake up, and there must be a lot of them, because it's surprising how many times simply talking something out solves it, doesn't it?

Or get the big muscles moving by taking a walk around the block.  It gives you a chance to get some air, restate the problem, and see it in a different light.  Believe me, most problems don't get much worse in the ten minutes it'll take you to take in a short stroll.  And who knows, it may keep the clients from breathing down your neck for a minute or two, releasing some of your CPU time from the "ohmigod ohmigod ohmigod..." loop and freeing it up for the problem-solving section.

I hope that with these suggestions I've reminded you of some of your own stories, perhaps suggested a new approach or two, and possibly even brought a smile to your face once or twice.  Again, in this article I meant no ill will to anyone and in case I've not made it clear so far, I have no illusions about having all -- or even most -- of the answers.  I figure that if I can just remember not to repeat a mistake, then eventually I will have made every possible mistake... and then I'll be perfect.  Until then, however, I can't wait to hear your rules for troubleshooting network problems.  Thanks for reading!