Showing posts with label Bug. Show all posts

Tuesday, June 8, 2010

Install Microsoft Dynamics AX Enterprise Portal Server

Last week, I finally decided to look at the installation of the Enterprise Portal (based on Windows SharePoint Services 3.0) for Dynamics AX 4.0. I therefore started to investigate its installation and configuration. Microsoft provides a very complete document about it: Install and Configure a Microsoft Dynamics AX Enterprise Portal Server.

The instructions are quite detailed and can be followed as is. However, there is no guarantee that everything will happen by the book. This is why, before configuring and deploying the Enterprise Portal itself, I decided to check that the earlier settings changes had not prevented WSS from running. I opened the WSS Central Administration and it was full of "SharePoint encountered an unknown error." messages, preventing me from viewing any page.

I decided to run the SharePoint Configuration Manager but it did not solve the issue. A quick look at the Event Viewer revealed hordes of different nasty errors under both the Application tab and the System tab: error 10016, 5214, 18056, 3351, 2424, 6611, 2426, 110, and 8214.

The solution was to concentrate on the errors from the System tab in order of appearance, starting with the first one (error 10016). That error came from a service account requiring an additional security permission on a component listed in the Component Services administrative tool. Ultimately the solution was to add the Network Service account as a user authorized to start that component. The additional trick here was to identify the right component, as the error 10016 message only referred to its CLSID and not to its full, clear name. Happily, Google quickly provided the right component name: IIS_WAMREG. The operation had to be repeated for the Business Connector account as well (as described in the installation document).

Once both accounts' permissions were set up, all errors stopped occurring and WSS was running correctly. It was then time to start with the configuration and deployment of the Enterprise Portal itself. I launched the Dynamics AX client and went to the Enterprise Portal setup in the Administration panel.

The Enterprise Configuration wizard is pretty straightforward and I have no particular comment about it. Upon completion, it proposes to launch the "Manage deployments" wizard. That wizard is slightly different than what is described in the documentation but it serves the same purpose.

However when running it, we had a small error upon completion. That error was unhelpfully logged as event ID 1000 in the Event viewer. On top of that, the EP custom site template was not deployed in SharePoint and this alone completely prevented the creation of the EP site.

The solution came from the following article from Customer Source (Article ID 940365).

Before finding the solution, I tried a few times to remove and redeploy the Enterprise Portal. During these trial-and-error attempts, I discovered that:
  1. Clicking on the Remove button (see the screenshot below) while any site was selected would immediately crash the AX client.
  2. To remove the Enterprise Portal, you first need to double-click on all ticked boxes, then click the Remove button.



After applying the fix from Customer Source, I still had the AX error and the crash-upon-remove behaviour, but the EP custom templates were correctly deployed. I could finally create my first Enterprise Portal site.

After its creation, the first step is to link the EP site to a company from Dynamics AX. And of course the first attempt failed miserably. This time the solution came from article ID 931939, still from Customer Source. Once that fix was applied, I could link a Dynamics AX company to the EP site and start checking all its nice features.

Monday, April 26, 2010

Evolution of Star Wars Combine Development Process

That's it! We just migrated the Star Wars Combine online game to our new servers. We perform such an upgrade of hardware every 3 years roughly to cope with growth of both users and resources. We also use these upgrades as an opportunity to improve our development processes. Or maybe it is the opposite: we decide to update our development processes and we upgrade our hardware at the same time.

To keep it short: when we started the Combine back in 1998, all we had as an environment was a few flat text files as a database, some HTML files for the website, and a downloadable client developed by a single person, who uploaded the compiled software to the server so that our players could download it. Back then the server was in fact just a small account on a free hosting service whose name I have long forgotten. Do not forget that we are just a bunch of volunteers working on this project during our free time and for free (or at least for the sake of code).

Over time, we switched from a client/server model to a pure web-based model. Our architecture became clearer in our minds, and since 2006 we have been using two servers: one as the production environment and one as the development environment. For a couple of years now, we have even been releasing our new features and bugfixes as weekly patches, allowing us to fix most problems on the development environment before going live.

This was already an improvement, but we still had several code quality issues due to our developers working in parallel and most of them having very different programming levels and backgrounds.

Now, since our Development Meeting last February in Berlin, a new process has been introduced along with new tools. First, we have created a virtual machine of our server so that developers can work locally on their own repository. That VM includes the database and the web server. When they make changes, all they need to do is save their file and refresh the relevant page on the VM. Once they are happy with their changes, they can commit them to the development server.

On the development server, we have testers performing functional tests and providing input to the developers. On top of the functional testing, we have some unit tests that ensure that new developments will not break existing code (and features). Then, once all our tests are passed, we can proceed with our weekly release. Database schema changes are pushed and the production code is updated.

Even if this new process requires more effort from the developers, such as writing unit tests and spending more time in the testing phase, its medium-term and long-term benefits will be invaluable. We are creating today a way to decrease tomorrow's bugs.

At the same time, we introduced a powerful dedicated database server to cope with the increasing database demands generated by our natural growth and increasing game complexity. Our front-end web server now only has to take care of generating webpages, which opens the gate to future load balancing.

Friday, February 26, 2010

Google Latitude can Go Wrong !

Yesterday, I was coming back from a business trip in France. On the way back, the Thalys train shuttling me from Paris to Brussels suddenly came to a total stop. I wanted to know where we were so that I could estimate the remaining distance.

I looked at my Blackberry and could see from the telecom operator name that we were already in Belgium. I then decided to use the Google Latitude application to get more information. Here is what I got:


And I got the address as well, confirming that Google really located me north of Vilvoorde and that it was not just a display error:

If you know the Thalys, you know that you arrive in Brussels from the south, so you cannot be to the north of it. A few minutes later, I had confirmation that we were indeed in the south as we passed through the Halle (Buizingen) train station. As you can see from the Google map below, there is around 40 km between my reported position (B) and my actual position (A).


Moreover, when refreshing my position, I noticed that while the train was heading north toward Brussels, my reported position was moving from north to south, also toward Brussels.

Who said that Google is always right ?
There is always one more bug I would say.

Friday, January 15, 2010

HRESULT: 0x80040E14 error when adding items to SharePoint

When attempting to create a new folder in our SharePoint, I faced the following error message: Exception from HRESULT: 0x80040E14, with an impressive stack trace filling my entire screen.

Googling this error message will send you to Microsoft KB 841216 as the first result. However, the third result shows an alternative solution from Alex Pearce's SharePoint blog. Alex suggests that the reason can in fact be totally trivial and due to:

  1. Your SQL server data drive being full, or
  2. Your data or log file having reached its full allocated size, or
  3. Your database being full and not set to grow.
In my case, the log file was full and I simply needed to shrink it. Kind of stupid, no ?

Monday, August 31, 2009

Upgrading MS BI to SQL 2008: Do not underestimate the Power of the Dark Side

For our BI setup running under Microsoft BI 2005, we have an environment that comes straight from the textbook. We have a development environment (DEV) composed of 4 servers: two are SQL 2005 database servers (a staging database and a data warehouse), one is a SharePoint 2007 server that also hosts the SharePoint database (SQL 2005), and one is a Visual Studio server. Similarly, we have a testing environment (TEST) composed of 3 servers: a staging database, a data warehouse, and a SharePoint server hosting the SharePoint database as well. Both the DEV and the TEST environments are virtualized. Then we have our production environment (PROD) composed of 4 physical servers: a staging server, a data warehouse, a SharePoint server, and a SharePoint database server. The only difference is that in production, we have separated the SharePoint front layer from the database layer for performance reasons.

Developers develop in the DEV environment and technical approvals are done there. Testers test in the TEST environment, which is also used for the functional approvals. Production data runs in the production environment. So far so good and nothing new. I just described an out-of-the-manual architectural setup. All databases are up-to-date SQL 2005 and the SharePoint version is 2007.

We had planned to upgrade this environment to Microsoft SQL 2008 and thought it was quite well planned. We had waited for SQL 2008 SP1 to be released. We had waited for our sub-contractor to accumulate some upgrade experience and finally decided to migrate progressively. Last July, we migrated our DEV environment. Everything went smoothly. Some of us then went on vacation. The migration of our TEST environment was planned for mid-August and took place as expected. All the functional tests we had prepared passed without any problem. At this point, as all lights were green, I decided to authorize the migration of the PROD environment.

And guess what ? Everything went fine with only one unexplained manual reboot. We later even noticed a small but noticeable performance gain with SQL 2008.

Over the following days, I got an alarming phone call from the IT operations department. They reported that since the SQL upgrade, all production servers kept rebooting every 2 to 12 hours for no apparent reason.

We immediately started our investigations but quickly discovered that there was no logged Windows event or error message. Close monitoring of SQL did not give us any clue. It took a couple of days to find out that the reboots were triggered by the monitoring systems because the servers were freezing for more than 10 minutes. Still, we had no idea why they were freezing. It only happened on the physical PROD servers, not on the virtual DEV or TEST servers.

The main difference between virtual and physical machines concerns the underlying drivers. After a couple of days, we upgraded several drivers on one of the blade servers: the HBA StorPort driver, the iLO 2 card firmware, the blade power management control and various other minor drivers. It worked and fixed the problem.

Conclusion of the story ? You cannot foresee everything and even if you work by the book, shit may happen. The good news is that we only lost 3 days with a very limited impact on end-users.

Friday, January 2, 2009

Microsoft Zune Dies on New Year

Many Microsoft Zune music players stopped working during New Year's night. It has been observed that these MP3 players were programmed to work for 365 days in 2008 instead of 366, even though 2008 was a leap year. 2008 was a day too long for the Zune. Complaints have been flooding Zune user forums.

According to Microsoft, a bug linked to the date system is the root cause. The internal clock driver did not take the 2008 leap year into account and assumed a normal 365-day year. Still according to Microsoft, these start-up problems only concern the 30 GB Zune version launched in 2006.
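To picture this class of bug: a clock that stores the date as a count of days since a fixed origin has to convert that count back into a year, and the conversion must allow 366 days for leap years. The Python sketch below is only an illustration of a conversion that handles day 366 correctly; it is not the Zune's driver code, and the origin year 1980 and all function names are assumptions chosen for the example.

```python
from datetime import date


def is_leap_year(year: int) -> bool:
    """Gregorian leap-year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_year(year: int) -> int:
    """Length of the given year in days."""
    return 366 if is_leap_year(year) else 365


def year_and_day(day_count: int, origin_year: int = 1980) -> tuple[int, int]:
    """Convert a 1-based day count into (year, day_of_year).

    day_count == 1 means January 1 of origin_year (an assumed origin).
    """
    if day_count < 1:
        raise ValueError("day count must be >= 1")
    year = origin_year
    days = day_count
    # Subtract whole years using each year's real length, so that day 366
    # of a leap year stays inside that year instead of overflowing.
    while days > days_in_year(year):
        days -= days_in_year(year)
        year += 1
    return year, days


def day_count_for(d: date, origin_year: int = 1980) -> int:
    """Helper for the example: 1-based day count of a calendar date."""
    return (d - date(origin_year, 1, 1)).days + 1


print(year_and_day(day_count_for(date(2008, 12, 31))))  # last day of the leap year
print(year_and_day(day_count_for(date(2009, 1, 1))))    # first day of the next year
```

A version that hard-codes 365 days for every year would have no valid answer for December 31, 2008, which is the day the players stopped starting up.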

Similar errors are most often observed at year changes. The most famous date-related software bug was of course the Y2K bug, which generated a lot of fear for very little damage.

Microsoft pointed out that the Zune problem should disappear by itself. In order to fix the problem, Zune users simply have to let their players discharge completely before charging them again; a simple cold start for the new year, in fact.

Monday, February 11, 2008

Overloading Second Life

Today, I learned something funny from Kristian Köhntopp, a Principal Consultant from MySQL Berlin. Second Life, the famous online virtual world, uses a partitioning system based on the geographical layout of its universe. In short, this means that each server supporting Second Life only supports a small area of the universe. If more than 200 players gather in the same room, building or island, then there is a serious chance of seeing it crash.
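A geographical partitioning scheme of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Second Life's actual implementation: the region size, the way regions are assigned to servers and every name below are hypothetical, and only the 200-player figure comes from the conversation above.

```python
from collections import Counter

REGION_SIZE = 256             # assumed side length of one region, in world units
MAX_PLAYERS_PER_REGION = 200  # figure quoted above, used here as a warning threshold


def region_of(x: float, y: float) -> tuple[int, int]:
    """Map a world position to the grid cell (region) that contains it."""
    return (int(x // REGION_SIZE), int(y // REGION_SIZE))


def server_for(region: tuple[int, int], servers: list[str]) -> str:
    """Pick the server responsible for a region (hypothetical static assignment)."""
    col, row = region
    return servers[(col * 31 + row) % len(servers)]


def overloaded_regions(positions: list[tuple[float, float]]) -> list[tuple[int, int]]:
    """Return the regions holding more players than the threshold."""
    load = Counter(region_of(x, y) for x, y in positions)
    return [region for region, count in load.items()
            if count > MAX_PLAYERS_PER_REGION]


# 201 players crowd into one spot while 3 others stand elsewhere.
crowd = [(5.0, 5.0)] * 201 + [(300.0, 300.0)] * 3
print(overloaded_regions(crowd))
```

The weakness is visible in the sketch: the load of a server depends on where players choose to stand, not on how many players there are in total, so a single popular spot can overwhelm one machine while the others sit idle.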

Why would it be this funny ?
Simply because I wrote my Master's thesis on Networked Virtual Environments back in 2001-2002, when I was still studying at the Free University of Brussels. In it, I described a partitioning model that was also based on geographical location, and I even wrote a small prototype with two servers using my own online game, the Star Wars Combine.

We live in such a small world ...