MicroStrategy - Troubleshooting

http://community.microstrategy.com/t5/Server/TN19579-How-to-troubleshoot-problems-with-MicroStrategy/ta-p/179781
http://community.microstrategy.com/t5/Architect/TN13321-Important-diagnostics-used-for-capturing-SQL-for-various/ta-p/173863

  • Do a ping test
  • Make sure that the clocks on these servers are synchronized
  • Put a large file on the Tomcat server. Use IE on the IServer to pull that file from Tomcat. Use Wireshark to capture the traffic. The capture should show any packet drops
  • netstat -s shows the total count of retransmissions
  • See if we can use tcpdump or another command-line utility to capture the packets on each of these servers for a day
  • See what ports are open on each of these servers
  • If no other tools are available, and if we cannot configure MicroStrategy to send the log files to one place, write a script (Java program) that would be invoked by NRPE. This script, given a parameter (the name of the log file), would read the content of that log file from where it previously left off and send it over the network. Create a mechanism so that we can view these log files, either merged together or individually. If the content of the log file contains the word SEVERE, Nagios should notify me.
  • Study the log files every day
  • Read page 808 of the System Administration Guide version 9.2.1, section titled Using firewalls
  • Go back to previous log files, pull out all the messages and their times, and put them on a graph. I should probably write a script that parses these log files and puts them in the database. In any case, I need to extract the relevant messages regarding sockets, put them at the top of this page, search the Internet, and identify the pattern.
  • Search the Internet for "socket timeout", "connection reset by peer", and other things that show up in the log file
  • Monitor the number of open sockets. See what the maximum number of sockets that can be opened is. It is unlikely that we have exhausted the maximum number of sockets, but this has happened to me before.
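
The NRPE idea in the list above (read a log file from where the last run left off, and alert on SEVERE) could look roughly like the sketch below. The note suggests a Java program; Python is used here only for brevity. The function names, the JSON state file, and the choice to treat a shrunken file as rotated are all assumptions for illustration, not anything MicroStrategy or Nagios provides. The return codes follow the usual Nagios plugin convention (0 = OK, 2 = CRITICAL).

```python
import json
import os


def read_new_lines(log_path, state_path):
    """Return complete lines appended to log_path since the previous call.

    The byte offset reached on the last run is kept in a small JSON
    state file (hypothetical layout: {log_path: offset}).
    """
    offsets = {}
    if os.path.exists(state_path):
        with open(state_path, "r", encoding="utf-8") as fh:
            offsets = json.load(fh)

    start = offsets.get(log_path, 0)
    # Assumption: if the file is now smaller than the saved offset,
    # it was rotated or truncated, so start again from the beginning.
    if os.path.getsize(log_path) < start:
        start = 0

    with open(log_path, "rb") as fh:
        fh.seek(start)
        data = fh.read()

    # Only consume up to the last newline so that a line still being
    # written is picked up whole on the next run.
    complete = data[: data.rfind(b"\n") + 1]
    offsets[log_path] = start + len(complete)

    with open(state_path, "w", encoding="utf-8") as fh:
        json.dump(offsets, fh)

    return complete.decode("utf-8", errors="replace").splitlines()


def nagios_check(log_path, state_path, keyword="SEVERE"):
    """Return (exit_code, message) for the lines added since the last run."""
    hits = [line for line in read_new_lines(log_path, state_path)
            if keyword in line]
    if hits:
        return 2, "CRITICAL - %d new %s line(s); first: %s" % (
            len(hits), keyword, hits[0])
    return 0, "OK - no new %s lines" % keyword
```

A wrapper script invoked by NRPE would print the message and exit with the returned code. Forwarding the new lines over the network to a central place is left out of this sketch.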

How to use tcpdump on Windows
How to use Wireshark
How to capture network packets on Windows
Get a list of open ports on each box
Tomcat / MicroStrategy: how to ignore "connection reset by peer" when the user presses the stop button or is on a slow network.

java.net.SocketException: Connection reset by peer: socket write error (probably the user pressed the stop button, or the user was on a slow network): 13
java.net.SocketException: Connection reset by peer: socket write error (CustomSSO is involved): 3
java.net.SocketException: Connection reset by peer: socket write error (CustomTask is involved): 15
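
A tally like the one above could be produced by a small script instead of by hand. The sketch below is only an illustration under stated assumptions: each exception line is followed by stack-trace frames that start with `at `, and the names `CustomSSO` and `CustomTask` appear somewhere in those frames. The real log layout may differ (for example, `Caused by:` lines would end a trace early here), and the function name is hypothetical.

```python
from collections import Counter

RESET = "java.net.SocketException: Connection reset by peer"


def tally_resets(lines, markers=("CustomSSO", "CustomTask")):
    """Count 'Connection reset by peer' exceptions, grouped by the first
    marker found in the stack trace that follows each one."""
    counts = Counter()
    trace = None  # frames collected for the exception currently open

    def close(frames):
        label = next(
            (m for m in markers if any(m in f for f in frames)), "other")
        counts[label] += 1

    for line in lines:
        if RESET in line:
            if trace is not None:
                close(trace)       # previous exception had no trailing line
            trace = []
        elif trace is not None:
            if line.lstrip().startswith("at "):
                trace.append(line)  # stack frame of the open exception
            else:
                close(trace)        # any other line ends the trace
                trace = None
    if trace is not None:
        close(trace)
    return counts
```

Entries that match neither marker land in `other`, which corresponds to the "user pressed stop or slow network" guess in the first line of the tally.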

Search the manuals to see if there are any settings related to timeout or keepalive
See if there is a setting within MicroStrategy that can control TCP keepalive / timeout. If there is no setting within MicroStrategy for controlling TCP keepalive, then we look into turning off firewalls, and tweaking TCP/IP settings on Windows.
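
For reference, this is roughly what TCP keepalive looks like at the socket level. It is a generic sketch, not a MicroStrategy setting: it only applies to sockets that our own code opens, and the function name and default timings are assumptions. On Windows the per-socket timings go through `SIO_KEEPALIVE_VALS`; on Linux through `TCP_KEEPIDLE` and `TCP_KEEPINTVL`; on other platforms only the on/off flag is set and the system-wide defaults apply.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10):
    """Turn on TCP keepalive for sock and, where the platform allows it,
    set the idle time before the first probe and the probe interval."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    if hasattr(socket, "SIO_KEEPALIVE_VALS"):
        # Windows: (on/off, idle time in ms, probe interval in ms)
        sock.ioctl(socket.SIO_KEEPALIVE_VALS,
                   (1, idle_s * 1000, interval_s * 1000))
    elif hasattr(socket, "TCP_KEEPIDLE") and hasattr(socket, "TCP_KEEPINTVL"):
        # Linux: values are in seconds
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
```

The idle time would need to be shorter than whatever idle timeout the firewall in the path enforces, otherwise the probes arrive after the connection has already been dropped.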

http://pic.dhe.ibm.com/infocenter/tivihelp/v2r1/index.jsp?topic=%2Fcom.ibm.IBMDS.doc%2Ftuning94.htm
http://stackoverflow.com/questions/8176821/how-to-set-the-keep-alive-interval-for-winsock

http://www.starquest.com/Supportdocs/techStarLicense/SL002_TCPKeepAlive.shtml
http://blogs.technet.com/b/nettracer/archive/2010/06/03/things-that-you-may-want-to-know-about-tcp-keepalives.aspx
http://kb.globalscape.com/KnowledgebaseArticle10438.aspx?Keywords=keep+alive
http://support.esri.com/en/knowledgebase/techarticles/detail/25129

http://www.paessler.com/packet_loss
http://help.rr.com/hmsfaqs/e_packetloss.aspx
http://www.symantec.com/business/support/index?page=content&id=TECH66444
http://stackoverflow.com/questions/1403097/socket-dis-connects-on-one-end-firewall
http://forum.xbmc.org/showthread.php?tid=81872

http://docs.oracle.com/javase/1.4.2/docs/api/java/net/SocketOptions.html
http://wiki.apache.org/hadoop/SocketTimeout
http://www.ethernut.de/nutwiki/Socket_Timeouts
http://publib.boulder.ibm.com/infocenter/adiehelp/v5r1m1/index.jsp?topic=%2Fcom.ibm.etools.ctc.ims.doc%2Fconcepts%2Fcimssocket.html
http://docs.oracle.com/javase/1.4.2/docs/api/java/net/Socket.html
http://www.tek-tips.com/viewthread.cfm?qid=1042039
http://stackoverflow.com/questions/1480236/does-a-tcp-socket-connection-have-a-keep-alive
http://stackoverflow.com/questions/10552591/java-socket-server-keep-alive-implementation
http://www.java-forums.org/networking/7592-how-configure-keep-alive-sockets.html
http://www.programmingforums.org/thread29488.html

http://blog.mafr.de/2010/03/14/tcp-for-low-latency-applications/
https://forums.oracle.com/forums/thread.jspa?threadID=1772765
Maybe we need to write a task that just connects to the IServer, does nothing, and returns. This task should run every 15 minutes.
https://resource.microstrategy.com/Forum/ReplyListPage.aspx?id=12698
https://resource.microstrategy.com/forum/ReplyListPage.aspx?id=19456
https://resource.microstrategy.com/forum/ReplyListPage.aspx?id=8705
https://resource.microstrategy.com/forum/ReplyListPage.aspx?id=4717
http://164.77.160.147/microstrategy/help/WebAdmin/WebHelp/Lang_1033/Default_server_properties.htm
http://www.microstrategyblog.com/2010/02/time-out-tune-the-project/
http://www.tek-tips.com/viewthread.cfm?qid=1047144
http://businessintelligence.ittoolbox.com/groups/technical-functional/microstrategy-l/weird-time-out-when-running-reports-1284966
http://www.bryanbrandow.com/2011/03/getting-started-with-microstrategy.html
http://211.171.208.120/OLAP/asp/Main.aspx?pg=adminHelp&adminHelp=1&subpage=Adm_ServerDefPro.htm
https://resource.microstrategy.com/forum/ReplyListPage.aspx?id=10996
http://docs.oracle.com/cd/E19528-01/819-0992/ds-tcp-settings/index.html
http://onlamp.com/onlamp/2005/11/17/tcp_tuning.html
http://content.gpwiki.org/index.php/Java:Tutorials:Simple_TCP_Networking
http://code.google.com/p/hazelcast/wiki/ConfigFullTcpIp
http://www.techsupportalert.com/best-free-tcp-settings-tweaker.htm
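
The "connect, do nothing, return" idea noted earlier in this list could be prototyped outside MicroStrategy as below. This is only a reachability probe from a separate process: it opens a plain TCP connection and closes it, so it does not exercise or keep alive the pooled connections that MicroStrategy Web itself holds, which is what a real task would do. The host, port, and function names are placeholders to be replaced with the actual IServer address and configured port.

```python
import socket
import time


def ping_iserver(host, port, timeout_s=10.0):
    """Open a TCP connection to host:port and close it immediately.
    Returns True if the connection was established, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def run_heartbeat(host, port, rounds, interval_s=15 * 60, sleep=time.sleep):
    """Ping the server `rounds` times, waiting interval_s between pings.
    Returns the list of results, one boolean per ping."""
    results = []
    for i in range(rounds):
        results.append(ping_iserver(host, port))
        if i < rounds - 1:
            sleep(interval_s)
    return results
```

In practice a scheduler (Windows Task Scheduler or cron) calling `ping_iserver` once per run is probably simpler than a long-running loop.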

http://www.ecr6.ohio-state.edu/window-scaling.html
http://www.stoufis.gr/blog/topics/709
http://kb.globalscape.com/KnowledgebaseArticle10438.aspx
http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r5/index.jsp?topic=%2Fcom.ibm.swg.im.iis.productization.iisinfsv.install.doc%2Ftopics%2Fwsisinst_config_winregtcpip.html
http://ltxfaq.custhelp.com/app/answers/detail/a_id/690/~/setting-the-tcp-keepalive-timer-in-windows-nt,-2000-and-xp
http://stackoverflow.com/questions/12248132/how-to-change-tcp-keepalive-timer-using-python-script
http://support.microsoft.com/kb/158474
http://www.bvanleeuwen.nl/faq/?p=170
http://www.pctools.com/guides/registry/detail/891/

http://board.jdownloader.org/showthread.php?t=5967

What happens if we have two Intelligence Servers that are not clustered? If nodes are manually removed from the cluster, projects are treated as separate in MicroStrategy Web, and the node that Web connects to depends on which project is selected. However, all projects still access the same metadata.

  • CPU utilization
  • Memory utilization
  • Load average
  • Disk space usage on each drive
  • See what metrics we are currently capturing for CC
  • Additional metrics that we can obtain from Java, Tomcat, or IServer (number of currently connected users, number of reports recently run, how long each query takes, etc)
  • We need to monitor the performance of the database servers too. This includes the production database, the metadata database, the statistics database, and the history database.
  • We need to know which queries are running slow.
  • user activity, data warehouse activity, report SQL, system performance
  • Look at the metrics that we are capturing for MySQL
  • Network statistics
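
As one example from the list above, disk space usage per drive can be checked with the Python standard library alone. The sketch below returns a Nagios-style status; the thresholds, function names, and drive paths are assumptions for illustration. CPU, memory, and the MicroStrategy-specific metrics would need other sources and are not covered here.

```python
import shutil


def disk_used_percent(path):
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def check_disks(paths, warn_pct=80.0, crit_pct=90.0):
    """Return (exit_code, message): 0 = OK, 1 = WARNING, 2 = CRITICAL,
    taking the worst status across all the given paths."""
    worst = 0
    parts = []
    for path in paths:
        pct = disk_used_percent(path)
        if pct >= crit_pct:
            worst = max(worst, 2)
        elif pct >= warn_pct:
            worst = max(worst, 1)
        parts.append("%s=%.1f%%" % (path, pct))
    label = ("OK", "WARNING", "CRITICAL")[worst]
    return worst, "DISK %s - %s" % (label, ", ".join(parts))
```

On the Windows servers this might be called as `check_disks(["C:\\", "D:\\"])`, with the drive letters adjusted to whatever each box actually has.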

Diagnostics and Performance Logging Tool
Enterprise Manager

Purge the change journal regularly
Purge the statistics database regularly

How frequently should we purge the statistics database? How can we measure the size of the statistics database?

Stop the IServers on staging

Talk to Jun about:

  1. Current setup: whether there is a firewall between the IServers and the database, between MicroStrategy Web and the database, and between MicroStrategy Web and the IServers. What ports are currently open?
  2. Is it possible for me and other team members to gain visibility into current resource utilization on these MicroStrategy servers? Can I have access to the tool that we are currently using for monitoring? If not, who do I need to work with in order to graph all these metrics and put these graphs into a dashboard for easy debugging?
  3. What software will we be using for the log server? What are we going to use for aggregating or streaming of log data?
  4. How do firewalls work with VMware? I understand the traditional physical environment, but how do firewalls work in a virtualized environment?
  5. Are there any firewalls or security products like Windows Defender or Kaspersky AntiVirus running on each of these servers? Can we disable those for a few days to see if the problem goes away?
  6. Can we disable the firewalls that are running on each of these servers for a few days to see if the problem goes away?
  7. Consider the situation with SSH. When we SSH into a Linux box via PuTTY or some other software, we often get connection timeouts or get disconnected. This is because there is a firewall in between. For SSH, we can adjust the keepalive setting either on the server or inside PuTTY. Why does the firewall kill the SSH session? If there are any firewalls involved between MicroStrategy Web and the IServers, including any firewalls that may be running on each of these servers, can we adjust this setting on all of them?
  8. I do not mean to insult anyone, but I am asking that you put on your thinking cap and think beyond the obvious places, the places that we previously looked at. What else can cause this problem? Do we have any firewalls running on each of these servers? Are there any settings on these firewalls that we can adjust?
  9. A tour of the data center may be helpful.

For MicroStrategy Web (Tomcat), where is the configuration file for logging?

webapps\MicroStrategy\WEB-INF\xml\logging.properties

Does Tomcat have an interface where we can change the log level without restarting Tomcat?
