Radosław Stankiewicz: 2014

Wednesday, 9 April 2014

Custom Trace in WebSphere Message Broker 7

Sometimes the standard console, syslog or Event Viewer output is not enough to understand an error in WebSphere Message Broker. This happened to me recently while implementing WS-Security in WMB 7.0.0.1.
Long story short, I hit a CWWSS6521E error with a huge stack trace and a funny exception (the localized Polish message means "Login failed because of an exception"):

com.ibm.wsspi.wssecurity.core.SoapSecurityException: CWWSS6521E: Logowanie nie powiodło się z powodu wyjątku.: javax.security.auth.login.LoginException: caught exception from broker

There was no place in the Broker where I could easily look for more details. Even a service trace in the official debug mode didn't show anything more detailed than this stack trace.
A coworker sent me a link describing one possible cause, but the stack trace there was different, so I wanted to be sure before installing additional fixes.
Here is the solution:

Add an additional JVM flag to the execution group that points to a file: -DtraceSettingsFile=MyTraceSettings.properties
Create the file MyTraceSettings.properties with the body: com.ibm.ws.wssecurity.*=all=enabled

This flag configures the trace to store tons of debug information, which finally reveals a more detailed exception:

Found an extranious X509 token - more than configured for PolicySet

To better understand this debug output: the JARs responsible for WS-Security are ws-security-impl-1.0-SNAPSHOT.jar or com.ibm.jaxws.thinclient_7.0.0, depending on the version of WMB.

I upgraded my broker to version 7.0.0.6 and that fixed my errors.

Tuesday, 18 March 2014

Udacity Data Wrangling with MongoDB: Las Vegas exercise

For the final project I chose the Las Vegas region, which was one of the stops on my last holiday. I remember how we chose the hotel: we opened one of the booking sites and searched for good prices and reviews.

I wanted to try a different approach: choose a hotel & casino based on its neighborhood, that is, how many other casinos and hotels are within a 10-minute walk (a 500 m radius).

Some information about the data loaded into MongoDB. Ideally, a document should look like this:
{
"id": "2406124091",
"type": "node",
"visible":"true",
"created": {
"version":"2",
"changeset":"17206049",
"timestamp":"2013-08-03T16:43:42Z",
"user":"linuxUser16",
"uid":"1219059"
},
"pos": [41.9757030, -87.6921867],
"address": {
"housenumber": "5157",
"postcode": "60625",
"street": "North Lincoln Ave"
},
"amenity": "restaurant",
"cuisine": "mexican",
"name": "La Cabana De Don Luis",
"phone": "1 (773)-271-5176"
}
What I really had:

{
"building": "yes",
"website": "http://www.caesarspalace.com",
"amenity": "casino",
"node_refs": [
"389482445",
"1483478762",
"1483478753",
[...]
"389482448",
"389482445"
],
"gnis:county_name": "Clark",
"created": {
"uid": "336460",
"changeset": "9675966",
"version": "8",
"user": "robgeb",
"timestamp": "2011-10-28T12:11:39Z"
},
"tourism": "hotel",
"wheelchair": "yes",
"wikipedia": "en:Caesars Palace",
"ele": "644",
"visible": null,
"address": {
"city": "Las Vegas",
"county": "Clark",
"state": "NV",
"street": "Las Vegas Boulevard",
"postcode": "89109",
"housenumber": "3570"
},
"gnis:feature_id": "2472987",
"type": "way",
"id": "115672893",
"name": "Caesars Hotel and Casino"
}

More complex elements (buildings, ways) don't have a position of their own. They reference other nodes whose main purpose is to carry a position, for example four nodes, one for each corner of a building.

MongoDB doesn't support joins, so in order to query by the location of a hotel I first need to collect the locations of the nodes it references.

I created a script that, for each element with node_refs, iterates over that array and builds an array of locations. I don't want to assign one specific location because that is generally invalid for roads and long buildings. MongoDB's $geoNear stage in the aggregation pipeline can match documents whose location field holds an array of points, but it does not accept an array as the center location. Here is the script:

# db is a pymongo database handle; 'lv' is the Las Vegas collection
p = db.lv.find({'node_refs': {'$exists': 1}})
for el in p:
    # let's add some details
    points = []

    # skip elements that were already processed
    if 'pos_many' in el:
        continue

    for ref in el['node_refs']:
        one = db.lv.find_one({'id': ref})
        if one is None:
            continue
        if 'pos' in one:
            points.append(one['pos'])
        # flag every node that is referenced by another element
        if 'is_referenced' not in one:
            one['is_referenced'] = 1
            db.lv.save(one)
    el['pos_many'] = points
    db.lv.save(el)

Apart from updating elements with locations, my script flags the nodes that were referenced. I would like to see what kind of nodes are referenced: are they only locations, or can I find, for example, a reference to a bus stop or a tram station? I could filter or delete based on this flag.

The script updated more than 70 000 objects that have references and flagged almost 678 000 referenced nodes. Only 72 of those were named nodes such as a tram station, so I can't simply delete the referenced nodes, but I won't lose too much information if I filter them out.
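The join-by-hand idea can be tried out without a database. The following is a minimal in-memory sketch, not the script used for the project: attach_positions is a hypothetical helper that replaces the per-reference find_one calls with a dictionary lookup, and the three sample documents are made up for illustration.

```python
def attach_positions(docs):
    """Add 'pos_many' to every document with 'node_refs' and flag referenced nodes.

    docs is a list of dicts shaped like the documents in the post
    ('id', optional 'pos', optional 'node_refs'). The list is modified in place.
    """
    by_id = {doc['id']: doc for doc in docs}  # stands in for db.lv.find_one({'id': ref})
    for doc in docs:
        if 'node_refs' not in doc or 'pos_many' in doc:
            continue
        points = []
        for ref in doc['node_refs']:
            node = by_id.get(ref)
            if node is None:
                continue  # reference to a node outside the extract
            if 'pos' in node:
                points.append(node['pos'])
            node['is_referenced'] = 1
        doc['pos_many'] = points
    return docs


# Made-up sample: two corner nodes and one way that references them.
sample = [
    {'id': '1', 'pos': [36.1150, -115.1740]},
    {'id': '2', 'pos': [36.1160, -115.1750]},
    {'id': 'w', 'type': 'way', 'node_refs': ['1', '2', '404', '1']},
]
attach_positions(sample)
print(sample[2]['pos_many'])
```

Building the lookup table once avoids one query per reference, which matters when hundreds of thousands of nodes are referenced.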

Now I can run a query to find all the casinos in the Las Vegas region:

db.lv.find({'$or': [{'name': {'$regex': 'Casino'}}, {'amenity': 'casino'}]})

It returns more than 50 casinos, some of which I know.
Now, for each casino, I can run:
db.lv.aggregate([
    {
        '$geoNear': {
            'near': pos,
            'distanceField': "dist.calculated",
            'maxDistance': 0.5/111.12,  # 500 m expressed in degrees
            'query': {'id': {'$ne': el['id']}, '$or': [{'name': {'$regex': 'Casino'}}, {'amenity': 'casino'}]},
            'includeLocs': "dist.location",
            'uniqueDocs': 1
        }
    }
])
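The 0.5/111.12 in maxDistance converts 500 m into degrees, taking one degree as roughly 111.12 km. The sketch below spells out that arithmetic. The function names are hypothetical, and the assumption that the reported distances are planar degree distances scaled by 111.12 km is mine: it reproduces the roughly 63.66 m listed further down between Bill's and the Flamingo, but the post does not state how the metres were derived. A haversine version is included for comparison.

```python
import math

KM_PER_DEGREE = 111.12  # constant used in the post


def km_to_degrees(km):
    """Convert a distance in kilometres to degrees (flat approximation)."""
    return km / KM_PER_DEGREE


def planar_distance_m(a, b):
    """Straight-line distance between two [lat, lon] pairs measured in degrees,
    then scaled to metres with the same 111.12 km per degree constant."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) * KM_PER_DEGREE * 1000


def haversine_m(a, b, radius_m=6371000.0):
    """Great-circle distance in metres between two [lat, lon] pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(h))


# Coordinates taken from the result lists in this post.
bills = [36.114907, -115.1725608]
flamingo = [36.1154373, -115.1723441]
print(km_to_degrees(0.5))
print(planar_distance_m(bills, flamingo))
print(haversine_m(bills, flamingo))
```

The great-circle value comes out slightly smaller than the planar one, because a degree of longitude is shorter than a degree of latitude at the latitude of Las Vegas. For a 500 m radius the difference is small.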

Running this query for each casino gives the following table of top casinos:

casino	# of casinos nearby
Bill's Gamblin' Hall & Saloon	8
Bellagio Hotel and Casino	8
Imperial Palace Hotel and Casino	7
Flamingo Hotel and Casino	7
Harrah's Hotel and Casino	7
Tropicana Hotel and Casino	6
Paris Hotel and Casino	6
Caesars Hotel and Casino	6
Excalibur Hotel and Casino	5

List of nearby casinos for the top 2 [name, distance in meters, location (lat, lon)]:

Bill's Gamblin' Hall & Saloon:

Flamingo Hotel and Casino 63.6570179247 [36.1154373, -115.1723441]
Bellagio Hotel and Casino 105.992429283 [36.1143679, -115.1733477]
Caesars Hotel and Casino 206.917388636 [36.1154496, -115.1743421]
Paris Hotel and Casino 209.970162638 [36.1130181, -115.1725101]
Bally's Hotel and Casino 229.545641554 [36.1143607, -115.1705686]
Imperial Palace Hotel and Casino 334.493011664 [36.1179157, -115.1726557]
Harrah's Hotel and Casino 397.143685884 [36.118481, -115.172568]
Planet Hollywood Hotel and Casino 466.059119841 [36.1109562, -115.1711528]

Bellagio Hotel and Casino:

Bill's Gamblin' Hall & Saloon 109.676718942 [36.114907, -115.1725608]
Caesars Hotel and Casino 144.503466085 [36.1150138, -115.1744618]
Flamingo Hotel and Casino 165.401843786 [36.1155678, -115.1725383]
Paris Hotel and Casino 173.189321322 [36.1130181, -115.1725101]
Bally's Hotel and Casino 280.117818406 [36.1136115, -115.1709408]
Imperial Palace Hotel and Casino 406.50188833 [36.1179157, -115.1726557]
Planet Hollywood Hotel and Casino 447.491259701 [36.1109562, -115.1711528]
Harrah's Hotel and Casino 470.026846719 [36.118481, -115.172568]

Both hotels are located in the very center of the city, at the corner of Las Vegas Boulevard and Flamingo Road. What I couldn't verify from this dataset is that Bill's Gamblin' Hall & Saloon is currently closed.

Known issues:

When choosing a single location for a hotel I took the first location in the array (one of the corners); you can see this in the picture above. It should be the center of the building instead.
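One simple way to address this would be to average the corner points stored in pos_many. The sketch below assumes exactly that: center_of is a hypothetical helper, it computes a plain vertex average rather than a true area centroid, and it treats latitude and longitude as flat coordinates, which should be acceptable for building-sized shapes. Closed ways repeat their first node at the end (as in the node_refs of the Caesars example), so the duplicate is dropped before averaging.

```python
def center_of(points):
    """Return the average [lat, lon] of a list of [lat, lon] points."""
    if not points:
        raise ValueError("at least one point is required")
    # A closed way lists its first node again as the last one; count it once.
    if len(points) > 1 and points[0] == points[-1]:
        points = points[:-1]
    lat = sum(p[0] for p in points) / len(points)
    lon = sum(p[1] for p in points) / len(points)
    return [lat, lon]


# Made-up square footprint, closed by repeating the first corner.
footprint = [[0.0, 0.0], [0.0, 2.0], [2.0, 2.0], [2.0, 0.0], [0.0, 0.0]]
print(center_of(footprint))
```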

Thursday, 13 February 2014

Content-Security-Policy issues with iOS Chrome

A few days ago I had the opportunity to trace an issue with Chrome on iOS not finishing loading a page. The page and all of its resources were downloaded properly, but Chrome kept showing that it was still loading, so no 'on load' events fired. The problem occurred only when reloading the site. Copying the whole content to a local static web server didn't reproduce the issue, so the content was not the problem. I was able to strip the whole page down to a simple 'hello world' page and the problem still existed on the original web server, so I had to look deeper, at the HTTP headers. I created a sample web server to show the problem I found:

import time
import BaseHTTPServer

class HTTPHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_POST(s):
        length = int(s.headers['Content-Length'])
        print length
        data = s.rfile.read(length).decode('utf-8')
        print data

    def do_GET(s):
        s.send_response(200)
        s.send_header("Content-Security-Policy", "script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; object-src 'self'; img-src 'self' ; media-src 'self'; frame-src 'self'; font-src 'self' ;connect-src 'self'; report-uri '192.168.43.17/report'")
        s.send_header("Content-Type", "text/html;charset=UTF-8")
        s.end_headers()
        s.wfile.write("<html><head><title>hello</title></head><body><p>hello world %s</p></body></html>"% s.path)

if __name__ == '__main__':
    server_class = BaseHTTPServer.HTTPServer
    httpd = server_class(('192.168.43.17', 80), HTTPHandler)
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        pass

    httpd.server_close()

There is only one unusual element here: the CSP header, which protects the site from cross-site scripting and provides a mechanism for reporting security violations. It turns out that Chrome itself triggers reports, because it violates the following directives:
1) frame-src with uri: chromeinvoke://cd931b8a0ca6aaed193d25b429ee4019
"csp-report":{
"document-uri": "http://192.168.43.17/",
"referrer": "",
"violated-directive": "frame-src 'self'",
"original-policy": "script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; object-src 'self'; img-src 'self' ; media-src 'self'; frame-src 'self'; font-src 'self' ;connect-src 'self'; report-uri '192.168.43.17/report'",
"blocked-uri": "chromeinvoke://cd931b8a0ca6aaed193d25b429ee4019",
"source-file": "http://192.168.43.17/",
"line-number": 1
}
2) connect-src with uri: https://localhost
"csp-report":{
"document-uri": "http://192.168.43.17/",
"referrer": "",
"violated-directive": "connect-src 'self'",
"original-policy": "script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; object-src 'self'; img-src 'self' ; media-src 'self'; frame-src 'self'; font-src 'self' ;connect-src 'self'; report-uri '192.168.43.17/report'",
"blocked-uri": "https://localhost",
"source-file": "http://192.168.43.17/",
"line-number": 1
}
3) frame-src with uri: chromenull://
"csp-report":{
"document-uri": "http://192.168.43.17/",
"referrer": "",
"violated-directive": "frame-src 'self'",
"original-policy": "script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; object-src 'self'; img-src 'self' ; media-src 'self'; frame-src 'self'; font-src 'self' ;connect-src 'self'; report-uri '192.168.43.17/report'",
"blocked-uri": "chromenull://",
"source-file": "http://192.168.43.17/",
"line-number": 21
}
4) frame-src with uri: chromeinvokeimmediate://3726692da42473af155b530fe0e48c61
"csp-report":{
"document-uri": "http://192.168.43.17/",
"referrer": "",
"violated-directive": "frame-src 'self'",
"original-policy": "script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; object-src 'self'; img-src 'self' ; media-src 'self'; frame-src 'self'; font-src 'self' ;connect-src 'self'; report-uri '192.168.43.17/report'",
"blocked-uri": "chromeinvokeimmediate://3726692da42473af155b530fe0e48c61",
"source-file": "http://192.168.43.17/",
"line-number": 2
}
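With more than a handful of reports it helps to group them. The following is a small sketch under one assumption: that each POST body is a JSON object with a single "csp-report" key, as the fragments above suggest. summarize_reports is a hypothetical helper that counts reports per violated directive and blocked URI scheme, and the sample bodies are trimmed copies of the reports shown above.

```python
import json
from collections import Counter


def summarize_reports(raw_reports):
    """Count CSP reports by (directive name, scheme of the blocked URI)."""
    counts = Counter()
    for raw in raw_reports:
        report = json.loads(raw).get("csp-report", {})
        # "frame-src 'self'" -> "frame-src"
        directive = report.get("violated-directive", "").split(" ")[0]
        # "chromeinvoke://cd93..." -> "chromeinvoke"
        scheme = report.get("blocked-uri", "").partition(":")[0]
        counts[(directive, scheme)] += 1
    return counts


bodies = [
    '{"csp-report": {"violated-directive": "frame-src \'self\'", '
    '"blocked-uri": "chromeinvoke://cd931b8a0ca6aaed193d25b429ee4019"}}',
    '{"csp-report": {"violated-directive": "connect-src \'self\'", '
    '"blocked-uri": "https://localhost"}}',
    '{"csp-report": {"violated-directive": "frame-src \'self\'", '
    '"blocked-uri": "chromenull://"}}',
]
for key, count in sorted(summarize_reports(bodies).items()):
    print(key, count)
```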

Further investigation showed that:
The issue with reporting internal/plugin URLs is known and has already been submitted as a bug.
Changing frame-src from 'self' to * solves the page-loading issue but lowers security.
An interesting fact is that when switching from incognito (anonymous) mode to normal mode, an iframe is visible for a short time.
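To experiment with the frame-src workaround without editing a long header string by hand, the policy can be kept as a mapping and serialized when needed. This is only a sketch: build_csp and relax_frame_src are hypothetical helper names, and the policy below is a shortened version of the header from the sample server.

```python
def build_csp(directives):
    """Serialize a mapping of directive -> list of sources into a header value."""
    return "; ".join(
        name + " " + " ".join(sources) for name, sources in directives.items()
    )


def relax_frame_src(directives):
    """Return a copy of the policy with frame-src opened up to any origin."""
    relaxed = dict(directives)
    relaxed["frame-src"] = ["*"]  # the workaround from the post; lowers security
    return relaxed


policy = {
    "script-src": ["'self'", "'unsafe-inline'", "'unsafe-eval'"],
    "frame-src": ["'self'"],
    "connect-src": ["'self'"],
}
print(build_csp(policy))
print(build_csp(relax_frame_src(policy)))
```

Keeping the strict and the relaxed variant side by side makes it easy to serve the relaxed one only where it is actually needed.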

Tuesday, 4 February 2014

Clustering Udacity forum users

One of the questions I wanted to answer is whether I can cluster users into groups. For clustering I wanted to use k-means.

First I had to prepare a simple export.

The mapper takes the forum and user files and selects the relevant fields from them:

import sys
import csv

def mapper():
    reader = csv.reader(sys.stdin, delimiter='\t')
    writer = csv.writer(sys.stdout, delimiter='\t', quotechar='"', quoting=csv.QUOTE_ALL)
    for line in reader:
        # skip the header rows of both input files
        if line[0] == "id" or line[0] == "user_ptr_id":
            continue
        if len(line) == 5:
            # user record: user id, reputation, gold, silver, bronze
            l = (line[0], 'A', line[1], line[2], line[3], line[4])
            writer.writerow(l)
        else:
            # forum record: emit only the user id of the post
            l = (line[3], 'B')
            writer.writerow(l)

def main():
    mapper()
    sys.stdin = sys.__stdin__

main()

The reducer outputs each user id along with the user's karma (reputation), badges and post count:

#!/usr/bin/python
import sys
import csv

def reducer():
    oldKey = None
    rep = 0
    gold = 0
    silver = 0
    bronze = 0
    count = 0
    reader = csv.reader(sys.stdin, delimiter='\t')
    for line in reader:
        if line[1] == 'A':
            # a new user record starts: flush the previous user first
            if oldKey:
                print '\t'.join([oldKey, rep, gold, silver, bronze, str(count)])
            oldKey, rep, gold, silver, bronze = line[0], line[2], line[3], line[4], line[5]
            count = 0
        else:  # 'B': one forum post of the current user
            count += 1
    # flush the last user
    if oldKey:
        print '\t'.join([oldKey, rep, gold, silver, bronze, str(count)])

def main():
    reducer()

if __name__ == "__main__":
    main()
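The mapper and reducer together implement a reduce-side join: user records are tagged 'A', post records 'B', and sorting by user id brings them together. The same logic can be checked locally without Hadoop. The sketch below is an in-memory stand-in written for Python 3, not the job that was actually run: join_users_posts is a hypothetical name and the rows are made up. Unlike the reducer above, it groups explicitly by user id, so posts from a user with no 'A' record are dropped rather than counted.

```python
from itertools import groupby


def join_users_posts(rows):
    """Join mapper-style rows into (user id, rep, gold, silver, bronze, post count).

    rows contains tuples of the form (uid, 'A', rep, gold, silver, bronze)
    for users and (uid, 'B') for forum posts.
    """
    result = []
    # Sorting by (uid, tag) plays the role of Hadoop's shuffle and sort phase.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for uid, group in groupby(ordered, key=lambda r: r[0]):
        user = None
        posts = 0
        for row in group:
            if row[1] == 'A':
                user = tuple(row[2:6])
            else:
                posts += 1
        if user is not None:
            result.append((uid,) + user + (posts,))
    return result


rows = [
    ('2', 'B'),
    ('1', 'A', '10', '0', '1', '2'),
    ('1', 'B'),
    ('1', 'B'),
    ('2', 'A', '5', '0', '0', '0'),
    ('3', 'B'),  # post by a user missing from the user file
]
for record in join_users_posts(rows):
    print('\t'.join(str(field) for field in record))
```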

I used Java Modelling Tools (http://jmt.sourceforge.net/) to visualize k-means clustering, and it looks like we can split the users into 3 clusters:

Cluster 1, 17432 users (99%), red:

Info	Center	Std. Dev.	Kurt.	Skew.
Reputation	111.457E0	273.317E0	330.457E-1	518.127E-2
Gold	270.996E-3	930.424E-3	868.928E-1	717.735E-2
Silver	878.499E-3	244.692E-2	874.737E-1	683.432E-2
Bronze	421.489E-2	613.437E-2	843.112E-1	584.215E-2
Count	823.078E-2	199.274E-1	820.611E-1	723.129E-2

Cluster 2, 157 users, blue:

Info	Center	Std. Dev.	Kurt.	Skew.
Reputation	555.198E1	267.607E1	284.950E-2	166.334E-2
Gold	712.739E-2	919.289E-2	112.543E-1	272.010E-2
Silver	211.210E-1	204.831E-1	512.309E-2	191.835E-2
Bronze	511.529E-1	324.686E-1	260.647E-2	133.968E-2
Count	302.185E0	238.706E0	122.783E-1	252.269E-2

Cluster 3, 18 users, pink:

Info	Center	Std. Dev.	Kurt.	Skew.
Reputation	267.654E2	105.582E2	123.350E-2	143.433E-2
Gold	242.222E-1	307.142E-1	145.243E-2	154.269E-2
Silver	846.111E-1	768.483E-1	-537.560E-3	863.259E-3
Bronze	134.889E0	103.588E0	-104.684E-2	678.600E-3
Count	760.833E0	622.366E0	-159.373E-2	331.745E-3

Plotting the three clusters against the two main variables gives a scatter plot with reputation on the x-axis and the number of posts on the y-axis.

I can see that most of the users are not active and that there is a very small group which helps a lot.
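For readers who want to see what k-means does with data of this shape, here is a minimal sketch of the standard Lloyd iteration on (reputation, posts) pairs. It is not what Java Modelling Tools runs internally, which I have not inspected. The six points and the initial centers are made up, and the features are left unscaled, so reputation dominates the distance.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Put every point into the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centers, max_iter=50):
    """Lloyd's algorithm: alternate assignment and mean update until stable."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        clusters = assign(points, centers)
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else c
            for c, members in zip(centers, clusters)
        ]
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)


# Made-up (reputation, posts) pairs: many quiet users, few very active ones.
users = [(100, 5), (120, 10), (90, 8), (5000, 300), (6000, 310), (27000, 760)]
centers, clusters = kmeans(users, centers=[users[0], users[3], users[5]])
for center, members in zip(centers, clusters):
    print(center, len(members))
```

With real data it would be worth scaling both variables before clustering, otherwise the post count barely influences the result.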