Archive for the ‘Software Product Development’ Category

31
Dec

Believe me, developing in the Ruby language is as fantastic and intuitive as the title sounds. Jokes apart, if you are into web development and RoR (Ruby on Rails) has not crossed your radar yet, I would like to take the liberty to aver that you are significantly out of date. Fortunately, if you keep reading this blog, you will get a glimpse of the revolution in web development that open source communities have brought about.

What can Ruby and Rails do for an organization?

What can any language-framework pair do for an organization? If answers are popping up in your mind, be kind enough to consider the same for Ruby on Rails. RoR can do all that in a much easier, faster and more cost-effective way. On the other hand, if you don’t know the answers yet, here is a glimpse of what RoR can do:

·         Build customer centric small/medium websites

·         Deploy a web product with advanced (web 2.0 -ish) features in a very limited time.

·         Create automated testing frameworks

·         Legacy Application extension, integration and migration

Why only Ruby-on-rails?

This is what the open source community has to say:

· Productivity gains: significant savings from smaller teams, higher productivity and improved time to market.

· Easy development, easy maintenance: Fewer lines of code (reportedly up to 10 times fewer than Java) make development faster and maintenance easier.

· Agility: Rails is agile. No matter how unsettled your business concept is, Rails helps you visualize it, and it adapts to your changing requirements quickly and easily.

· Free: It’s free and open source, so there is little more to say on cost effectiveness.

· Powerful: It can be simply put as: what you SEEK is what you get.

Not convinced? Check out how RoR compares with PHP, Java technologies and Perl.

For the more technically demanding soul, here is how Ruby on Rails scores over others:

· Support for Representational State Transfer (REST) architecture

· SOA-like integration with enterprise systems

· Convention over configuration framework simplifies data management

· Don’t Repeat Yourself or DRY model based on Ruby’s inherent ability to provide Domain specific languages

· MVC (model view controller) design pattern

· Built-in testing at every level

· Capistrano / ActiveRecord migration

· AJAX support in any framework; runs on any Unix open source platform

What has RoR done?

Testimonials: Some live projects singing the success story of Ruby on Rails.

Ø Amazon.com: Yes! The same Amazon.com

Ø Basecamp: project management tool by 37signals

Ø Bixee: An upcoming Indian job portal

Ø BharathRentals: India’s car rental service

Ø Dimewise: personal finance management

Ø HBO Asia: HBO’s Asian broadcast online

Ø Scribd: online document sharing and publishing

Ø Shopify: e-commerce

Ø simplifyMD: digital chart room for small hospitals

Ø Twitter: Online community and social networking


16
Jun

This post introduces the Agile methodology of software development. We will walk through the principles and processes of Agile and the outcome of “Going Agile”.

Agile Methodology

Agile draws on the best practices of the Waterfall, Prototyping and Spiral methodologies. Like Spiral, it focuses on short cycles of build and release, and each cycle goes through the Waterfall phases of planning and analysis, design, implementation and testing. As in Prototyping, each unit of work is refactored and polished repeatedly until it evolves into a finished product or feature.

Agile promotes teamwork and collaboration, encourages frequent evaluation, reduces risk and promotes adaptability to new requirements and company goals. Agile methods use adaptive planning that accommodates inevitable changes early in the project, whereas predictive planning methods resist change and suffer from the gap between static plans and dynamic reality.

Agile is based on the following key principles:

  • Active user involvement:

The stakeholders for whom the system is built are involved in its development, both in defining requirements and in the review processes.

  • Empowerment of the team

An Agile team is largely independent in its decisions: it owns its tasks, their estimation and their design, and it agrees on what will be delivered at the end of the iteration. The onus of achieving this “commitment” lies with the team.

  • Evolution of requirements

New requirements on the project are identified based on the latest trends in the market. Various tools for automation and continuous build and integration support development processes and help control regression.

  • Shorter Delivery cycles

Short delivery cycles give the stakeholders a ‘Time to Market’ advantage. Moreover, most risks are identified and mitigated at an early stage of the project.

Agile methods usually follow the given processes:

  • Task break ups

The team picks up tasks based on the priority defined by the stakeholders. Each task is broken up into smaller units and each unit is estimated. The estimate covers analysis, design and approach, unit testing and acceptance criteria. Once the estimation is done and the delivery date identified, it is the team’s responsibility to stick to the commitment.

  • Iterations

Iteration cycles usually range from 2 to 4 weeks, with shorter timelines generally preferred. A single iteration involves a complete SDLC: planning, analysis, design, development and testing. A feature may take multiple iterations before it reaches production quality.

  • Team

Team composition in Agile projects is cross-functional. The team takes responsibility for the required functionality when delivering the tasks in an iteration. The usual team size is 4 to 9 people, which promotes communication and collaboration. Agile emphasizes face-to-face communication between team members; for distributed teams, communication methods such as video conferencing, voice, email and IM are suggested.

  • Progressive product maturity

Usually any feature or product involves multiple iterations before it is ready to be deployed into production. Agile encourages the use of tools and techniques such as early and continuous integration, team estimation, test-driven design and development, and code refactoring to improve project quality.

Outcome of “Going Agile”

  • Early ROI for stakeholders on project with all round early risk mitigation.
  • Provide a competitive edge and adaptability to new and additional requirements during project development.
  • According to a survey from Dr. Dobb’s Journal, Agile methodology provides far better productivity compared to other methodologies.
  • Self Organizing team with better collaboration and communication.

In summary, Agile is a Win-Win for all the parties involved in the development and execution of the project.

Well-known agile methods are:

  • DSDM
  • Scrum
  • Extreme Programming (XP)

Look for more in this space on DSDM, Scrum and Extreme Programming.


Anand Ved– Technical Lead – Cloud Computing Project


02
Jul

This blog deals with the integration of MS Project with an enterprise product.

At a very basic level, MS Project, or MSP, is a project management tool, used by several organizations across various sectors to track and manage projects.

The following real-life example shows the value that MSP can add to a product when it is properly integrated with the development cycle.

In a particular project, the client had developed an enterprise product to track projects, which were part of Six Sigma initiatives within the organization. Over the years, this product was transformed into a tool, which could be used to manage projects within multiple initiatives such as IT, new product development etc.

This change in the product’s roadmap created a need to integrate the product with MSP. A primary reason for this was that a lot of customers were of the view that their managers were quite comfortable using MSP and wanted to continue using it as a scheduling tool.

The integration took the form of a plug-in, which has to be installed on each machine where MSP is used to manage projects held in the product. The plug-in uses the MSP APIs to set and retrieve data.

To develop the plug-in, the following steps were taken:

1)    First, the language in which the plug-in was to be developed was decided upon. A Microsoft language was preferred, since it would provide built-in objects to access the MSP APIs. In this case, Visual Basic was selected since it had fewer compatibility issues.

2)    The plug-in would need to primarily perform two operations:

  1. Import Data into MSP: By reading the source data in a specific format (in this case, XML), the plug-in would need to set the data using the APIs provided by MSP. This includes creating resources and tasks, and setting dates, efforts, constraints, resource rates etc. This is a critical step since the sequence in which the values of the fields need to be set has to be understood. For example, every task has actual work and actual overtime work.  When these values need to be entered into MSP from a host system, both of them first need to be set to zero followed by setting the actual work and the overtime work values, in that sequence. The sequence was critical here, because otherwise, incorrect values would be entered in MSP.
  2. Export Data from MSP: This step would be needed to get the data from MSP and generate a file, in the format expected by the enterprise product. As compared to the data import process, this was a relatively easier step, although certain cases could require a connection to some web service exposed by the enterprise product followed by sending the data in a file, which is generally in the XML format. An alternative would be to create a file and then ask the user to import it into the system through the enterprise product.

3)    The next step was to create an installer for the plug-in.

4)    The similarities and differences between MSP and the product also had to be examined as it was critical to build the logic based on how the product and MSP worked. There could be no single solution, which would integrate any product with MSP as the latter is highly customizable and has to be configured keeping the product in mind.

5)    The version of MSP being used was also important since MSP 2007 has some advanced features, such as “Resource-Cost” as a type of Resource, which are not available in MSP 2003.
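The ordering rule described in step 2 (reset both values to zero, then set actual work, then overtime work) can be sketched as follows. This is only a minimal Python illustration, not the plug-in's Visual Basic code: the `Task` class, its field names, the XML element names and the `import_tasks` helper are all assumptions invented for this example and are not part of the MSP API.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a scheduling-tool task (not the MSP object model)."""
    name: str
    actual_work: float = 0.0
    actual_overtime_work: float = 0.0
    write_log: list = field(default_factory=list)

    def set_field(self, field_name: str, value: float) -> None:
        # Record every write so the order of operations can be inspected.
        setattr(self, field_name, value)
        self.write_log.append((field_name, value))


def import_task(element: ET.Element) -> Task:
    """Build a Task from one <task> element, writing fields in a fixed order."""
    task = Task(name=element.get("name", ""))
    work = float(element.findtext("actualWork", default="0"))
    overtime = float(element.findtext("actualOvertimeWork", default="0"))

    # Step 1: reset both values to zero first, as the post describes.
    task.set_field("actual_work", 0.0)
    task.set_field("actual_overtime_work", 0.0)
    # Step 2: set actual work, then overtime work, in that sequence.
    task.set_field("actual_work", work)
    task.set_field("actual_overtime_work", overtime)
    return task


def import_tasks(xml_text: str) -> list:
    """Parse the (assumed) source XML and import every <task> element."""
    root = ET.fromstring(xml_text)
    return [import_task(el) for el in root.findall("task")]


# Assumed XML layout, for illustration only.
SAMPLE_XML = """
<project>
  <task name="Design review">
    <actualWork>16</actualWork>
    <actualOvertimeWork>4</actualOvertimeWork>
  </task>
</project>
"""

tasks = import_tasks(SAMPLE_XML)
```

Keeping the write order in one function means it is defined in exactly one place, which matters when the target tool recalculates dependent fields on every write.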

As the example above illustrates, integrating MSP with an enterprise product can add a lot of value to the product. To do this successfully, though, one needs a good knowledge of MSP and its various configurable parameters before a plug-in can be developed.


Ashish Nathani– Project Manager


09
Jul

Almost everyone today is aware of internet based companies such as Yahoo!, MSN and Google. These are sites concurrently handling several million visitors from across the globe every hour. Have you ever wondered about what goes into designing such high volume websites? This blog discusses the factors that need to be kept in mind while designing such portals.

There are several aspects to architecting high traffic web portals, which are expected to serve high concurrency with high availability and without degrading the performance. Apart from the architecture, other SDLC phases such as design, development and deployment also need special considerations. Since architecting the system is the very first step towards building the portal, this post will highlight some of the important architectural considerations.

To withstand the heavy traffic, the system should primarily be scalable, be highly available and should be able to intelligently delegate/distribute the traffic to improve the overall performance. Each of these aspects is discussed in turn below.

Scalability

Scalability is about concurrency and expandability. In the current context, it relates mainly to the servers that serve the application: the higher the capacity of a server, the more traffic it can serve. There are two types of scaling, each with its own pros and cons, and choosing one (or a combination of both) is a judgment call that depends on the expected traffic.

Vertical Scaling vs. Horizontal Scaling

Vertical Scaling: Also known as scaling up. This means adding more hardware resources, such as processors or memory, to the existing server to cope with increasing traffic. This method is easy to implement, but it comes with some disadvantages, such as:

  • Continuous upgrading of server is expensive.
  • There is always a limit for a given server to upgrade to.
  • If the server crashes, the application is not available.

Horizontal Scaling: Also known as scaling out. In this approach, instead of adding hardware resources to the existing server, extra server machines (possibly of comparatively lower capacity) are added to the pool. All the servers serve the same application. This is a cheaper approach since individual servers need not have a very high-end configuration. Additionally, even if one server crashes, the others in the cluster continue to serve the application. The main drawback is that it requires more administrative effort to configure and monitor the cluster.

High Availability

Backup Server

In this configuration, two servers are deployed for the same application. The primary server serves the application and the second server acts as a backup for the primary. If the primary server goes down for some reason, the backup server takes care of the user requests. There are two configurations possible with this:

Active-Standby: where the standby server is passive while the primary server is active. In case the primary server goes down, the current user sessions are not maintained when the backup server takes over.

Active-Active: In this case, the user sessions are maintained and continue to be served by the backup server when the primary goes down.

Clustering

For very high traffic, the clustered approach (Horizontal Scaling), which ensures high availability of the application, is effective. With a clustered environment, the user gets a seamless experience. This kind of environment can be configured to maintain the user’s web sessions if any of the servers goes down. Most of today’s application servers provide clustering as a built-in feature. With proper load balancing mechanisms in place, one can have servers of different capacities in the same cluster.


Performance

Performance refers to how efficiently a site responds to browser requests according to predefined benchmarks. Application performance can be designed, tuned, and measured. It can be affected by many complex factors, including application design and construction, database connectivity, network capacity and bandwidth, back office services (such as mail, proxy, and security services), and hardware server resources. Within the scope of this post, below are some of the performance considerations while architecting the system.

Load Balancers

In a clustered environment, the servers may not all have the same capacity in terms of CPU, RAM, etc. Software load balancers are available which can enforce a policy while distributing the load across the servers. The simplest policy is “round robin”, where requests are passed sequentially to each server in turn, thus using the cycles of every server in the cluster. Some of these tools also allow rules to be configured for individual servers on the basis of their CPU capacity, RAM or current load. For example, servers with lower capacity would serve comparatively fewer requests in order to maintain the performance benchmarks.
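To make the two policies concrete, here is a minimal Python sketch of plain round robin and of a capacity-weighted variant. It is an illustration only and not how any particular load balancer product is implemented; the server names and the integer weights are assumptions made for the example.

```python
from itertools import cycle


def round_robin(servers):
    """Yield servers in a fixed rotation, one per incoming request."""
    return cycle(servers)


def weighted_round_robin(weights):
    """Yield servers in rotation, repeating each one according to its weight.

    `weights` maps a server name to a positive integer capacity weight.
    """
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


def assign(balancer, request_count):
    """Assign `request_count` requests and count how many each server receives."""
    counts = {}
    for _ in range(request_count):
        server = next(balancer)
        counts[server] = counts.get(server, 0) + 1
    return counts


# Three equal servers share nine requests evenly.
equal_share = assign(round_robin(["app1", "app2", "app3"]), 9)
# A server with weight 3 receives three times the requests of one with weight 1.
weighted_share = assign(weighted_round_robin({"big": 3, "small": 1}), 8)
```

A static weight reflects capacity only. Balancing on current load, as some tools allow, would need live measurements from each server, which this sketch does not attempt.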

Delegating the Traffic

While loading any web page, a browser sends several HTTP requests to the server to download associated content such as images, CSS and JavaScript files, video files etc., which are required to be rendered on the page. It is possible to distribute the implicit requests for this content across different servers to allow the main application server to serve the dynamic contents of the main page. Several techniques can be adopted to achieve this, as discussed next.

  • Proxy Web Server

This is a commonly used technique where the web server acts as a proxy to the application server. All the static content (such as images, CSS, JavaScript and video files) used by the site is deployed and served by the web server. Only relevant requests are forwarded to the application server, which reduces the direct load on the application server. These web servers can also form a cluster in front of the application server cluster.

  • Use of CDN

A content delivery network or content distribution network (CDN) is a system of computers placed at various geographical locations in a network so as to maximize bandwidth for access to the data from clients throughout the network. All the servers in the network deploy and serve the same content. A client accesses a copy of the data nearest to it, as opposed to all clients accessing the same central server, so as to avoid a bottleneck near that server. These systems implement routing algorithms such that the nearest server serves the request for the fastest delivery.

Various vendors in the market provide this service at high quality, low cost and low network load. A CDN can offer very high availability, even during large power, network or hardware outages.
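The idea of serving each client from the nearest copy can be sketched in Python as below. This is a simplification under stated assumptions: real CDNs typically decide on network measurements rather than pure geography, and the edge locations, coordinates and function names here are hypothetical.

```python
import math

# Hypothetical edge locations as (latitude, longitude) in degrees.
EDGE_NODES = {
    "mumbai": (19.08, 72.88),
    "frankfurt": (50.11, 8.68),
    "san_jose": (37.34, -121.89),
}


def great_circle_km(a, b):
    """Haversine distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_edge(client_location, nodes=EDGE_NODES):
    """Return the name of the edge node closest to the client."""
    return min(nodes, key=lambda name: great_circle_km(client_location, nodes[name]))


pune_client = (18.52, 73.86)
paris_client = (48.86, 2.35)
print(nearest_edge(pune_client))   # mumbai
print(nearest_edge(paris_client))  # frankfurt
```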

Content distribution network

  • Third Party Storage Services

This approach refers to using third party services to store the data on their servers. These services help in reducing the initial investments in infrastructure. The storage space can be bought on demand. Generally, these services are used to store contents uploaded by users.

There are services such as Amazon S3 which provide online storage through a simple web services interface at a very nominal cost to store and retrieve the contents.

Although this approach is generally useful in reducing the hardware cost, it can also help improve performance in this context. Since the contents are stored on third party servers and are also available as URIs, the overall load on the main server is reduced to some extent.

Conclusion

In summary, some of the important architectural considerations for designing high traffic web sites and portals have been discussed in this blog. There are also several other factors at different phases of the design which need to be considered to achieve good concurrency and performance.


Ajit Mahajani– Technical Architect

06
Aug

Introduction

In today’s connected age, almost everyone knows what the Internet is. However, very few people have an idea about the workings of this worldwide network. The Internet, as we know it today, has had a long history of evolution and like any other interesting and useful invention, is governed by a set of rules and protocols.

This blog introduces the reader to the basics of the fundamental protocol behind the workings of the Internet – the Internet Protocol, and the current and upcoming versions of this protocol.

Internet Protocol

Internet Protocol (IP) is a set of rules used for data communication across a packet-switched network. Together with the Transmission Control Protocol (TCP), it forms the core of the protocol suite commonly known as TCP/IP. The Internet Protocol provides the mechanism for addressing hosts and routing data packets from a source to a destination across one or more IP networks, such as the Internet.

The major design principle behind IP was that the network infrastructure is assumed to be unreliable at any single link or node, and dynamic in terms of the number of links and nodes available between the source and destination. The network is supposed to be self-sustaining and there is no central monitoring agency to track the state of the network.

IPv4

The first version of Internet Protocol to be publicly deployed and widely used is IPv4. It was specified in RFC791 (http://tools.ietf.org/html/rfc791), published in September 1981, which provides the technical details of IPv4.

Limitations of the Addressing Mechanism in IPv4

Any computer or computing device connected to the Internet has to be allocated a unique address for identifying it on the worldwide network. This address is called the IP address. This address serves two main purposes: 1 – identifying a host system or a network interface on the Internet, and 2 – providing a logical address for a software application running on that computer.

IPv4 uses 32 bits for this address. Using 32 bits provides a maximum of 2^32, or 4,294,967,296 (about 4.3 billion), unique addresses. Out of these, some are reserved for special purposes, such as private networks and multicasting.
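The arithmetic can be checked with Python's standard `ipaddress` module. The sketch below is illustrative: the three private ranges (from RFC 1918) and the multicast range 224.0.0.0/4 are well-known reserved blocks that the text above does not list, and the sample address is arbitrary.

```python
import ipaddress

TOTAL_IPV4 = 2 ** 32  # every possible 32-bit value

# Well-known reserved blocks: RFC 1918 private ranges and the multicast range.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]
MULTICAST_BLOCK = ipaddress.ip_network("224.0.0.0/4")

private_total = sum(block.num_addresses for block in PRIVATE_BLOCKS)
multicast_total = MULTICAST_BLOCK.num_addresses

sample = ipaddress.ip_address("192.168.1.10")
print(TOTAL_IPV4)         # 4294967296
print(private_total)      # 17891328
print(multicast_total)    # 268435456
print(sample.is_private)  # True
print(int(sample))        # the same address as a single 32-bit integer
```

These two groups alone remove roughly 286 million addresses from public unicast use, and other reserved ranges exist beyond the ones shown.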

Also, due to the phenomenal growth of the Internet since the late 80s and early 90s, the pool of unassigned IPv4 addresses has been shrinking and is estimated to be exhausted before 2012. This anticipated shortage has been the driving factor behind the design of a new version of the Internet Protocol, IPv6.

IPv6

IPv6 was designed as a successor to IPv4 and was described by the IETF in a standards document RFC2460 (http://tools.ietf.org/html/rfc2460) published in December 1998. IPv6 uses a 128-bit addressing mechanism, resulting in 2^128 (approximately 3.4 x 10^38) addresses. This vast increase in the address space provides more flexibility in allocating addresses for devices as well as for routing data traffic.
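A short Python sketch, again using the standard `ipaddress` module, shows the size of the 128-bit space and the textual form of an IPv6 address. The address used is an arbitrary example taken from 2001:db8::/32, the prefix reserved for documentation.

```python
import ipaddress

TOTAL_IPV6 = 2 ** 128

# 2001:db8::/32 is the prefix reserved for documentation examples.
addr = ipaddress.ip_address("2001:0db8:0000:0000:0000:0000:0000:0001")

print(f"{TOTAL_IPV6:.1e}")    # 3.4e+38
print(addr.compressed)        # 2001:db8::1
print(addr.exploded)          # 2001:0db8:0000:0000:0000:0000:0000:0001
print(addr.version)           # 6
print(TOTAL_IPV6 // 2 ** 32)  # IPv6 addresses per IPv4 address, i.e. 2 ** 96
```

The compressed form drops leading zeros in each group and replaces one run of all-zero groups with `::`, which is why the long address shortens to `2001:db8::1`.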

Why not the name ‘IPv5’ after ‘IPv4’?

The designers of the next generation IP networks could not use the number ‘5’ as a successor to IPv4 because it had already been assigned to the Internet Stream Protocol (ST), an experimental protocol intended to support the transmission of streaming audio and video.

Key Features and Advantages of IPv6 and Differences from IPv4

IPv6 uses a new header format which is significantly different from that of IPv4, making the two protocols not interoperable. IPv6 also specifies a new packet format, designed to minimize the computing overhead in processing the packet headers. Some of the key features of IPv6 are listed below.

  • Larger address space

The most important feature of IPv6 over IPv4 is that it provides a 128-bit address compared to a 32-bit one for IPv4, increasing the number of possible addresses by a factor of 2^96. An example of the addressing provided by IPv6 is shown below.

IPv6


  • Stateless address autoconfiguration (SAA)

SAA is a technical term describing a mechanism in which IPv6 hosts can automatically configure themselves using predefined handshake signals and messages defined by ICMPv6 – a protocol designed to augment IPv6. ICMPv6 provides for several messages which are used for diagnostic and error reporting activities.

  • Multicasting

Multicasting is the ability of a host to send out a single packet to multiple destinations. IPv4 included multicasting as an optional feature (although it was most commonly implemented in all hardware and associated software) whereas IPv6 specifies multicasting as a mandatory feature.

  • Included network layer security

IPv6 includes a protocol for encryption and authentication, IPsec. This feature too was optional in IPv4 (although it was usually implemented) but made mandatory in IPv6.

  • Simplified processing by routers

To make the process of packet forwarding easier, a number of simplifications, the details of which are beyond the scope of this blog, were made to the IPv4 packet header forming the base for the IPv6 header. These changes were targeted at making data forwarding by routers simpler and hence more efficient.

  • Mobility

Mobile IPv6 (MIPv6) dropped the concept of triangular routing, in which a packet is sent to a proxy before being sent to the intended destination. Hence, mobile IPv6 is as efficient as normal IPv6. IPv6 routers may also support Network Mobility (defined by RFC3963, http://tools.ietf.org/html/rfc3963), allowing an entire network subnet to move to a new router connection point without being renumbered. Theoretically, this would also increase the efficiency of data flow.

  • Extensibility of Options and ‘Jumbogram’ Support

Options in IPv6 are implemented as additional extension headers after the IPv6 header, providing the size of an entire packet for implementing an option. This is unlike IPv4, which limits the options parameters to a fixed size of 40 octets.

IPv4 also limits packets to 65,535 (2^16 - 1) octets. IPv6 supports packets over this limit, known as ‘jumbograms’, which can be as large as 4,294,967,295 (2^32 - 1) octets.

IPv6 Deployment

Although IPv4 addresses are slowly being depleted, their exhaustion has been significantly slowed by the introduction of techniques such as classless inter-domain routing (CIDR) and the extensive use of network address translation (NAT). It has been forecast that IPv4 addresses will be completely depleted sometime between 2011 and 2012.

IPv6 still accounts for a very small fraction of the addresses used in the IPv4 dominated public Internet.

The biggest deployment area for IPv6 is cellular telephony, which is slowly being transitioned from 3G to 4G technologies where voice is seen as a Voice over Internet Protocol (VoIP) service, mandating the use of IPv6.

Hurdles to the Adoption of IPv6

The spread of IPv6 usage has been slow due to several barriers to its availability and adoption.

The biggest barrier to IPv6 adoption is its compatibility with legacy devices. Problems with such devices include manufacturers who either no longer exist, or make updates prohibitively expensive. Problems with these devices also include mask programmed firmware (ROM) which is impossible to upgrade as well as limitations of the device to support the IPv6 protocol stack.

Newer equipment supporting IPv6 includes newer and improved hardware, which increases the cost of the device to the end customer and makes software development for such devices expensive.

Lastly, a significant hurdle to the spread of IPv6 is the consumer’s lack of interest, since IPv4 is proving to be sufficient for all their needs right now.

In summary, we saw in this blog the advantages that IPv6 promises to bring for tomorrow’s connected world. It only remains to be seen when IPv6 becomes as common as the Internet itself is, today.



Amit Sheth– Sr. Technical Writer

14
Sep

Overview

Back in 2001, when a group of 17 software practitioners came together to discuss an alternative to documentation-driven, heavyweight software development processes, Agile was born. Agile is a software development approach based on iterative and incremental development, in which requirements and their solutions continuously evolve via collaboration amongst self-organizing, cross-functional teams. The Agile Manifesto, introduced in 2001, values individuals and interactions over processes and tools.

As happens with any new concept, Agile was misinterpreted as not requiring any tools at all, or as focused only on inclusive modeling using simple tools such as whiteboards and paper. Yes, whiteboards and paper are among the most basic tools in Agile and, if used correctly, can be very effective for a small Agile team in one location. However, in today’s connected world, the “Remote-Team” is commonplace. Agile or not, today’s teams and projects are connected to their counterparts in various parts of the world, and it becomes de rigueur to use sophisticated tools to increase collaboration and transparency.

I had the experience of working in a Distributed Scrum team spread across the US and India, and our team extensively used efficient Agile tools to run the day to day activities of the project. However, this was not always the case. One major advantage we had on our side was that the stakeholders were Scrum evangelists and were focused on breaking the traditional waterfall mould. In this blog, I plan to discuss these phases and a couple of tools which kept our project ticking.

Prior to using tools for Scrum

When a new tool comes aboard a project, it does make life easier for all stakeholders, especially in a distributed Scrum. In the early days, it used to be quite cumbersome for both the US and Indian team members. The project was driven primarily from the US where a majority of the stakeholders, including the ScrumMaster, were present. The Indian team would then go for a “Sprint Planning Meeting” and break down the user stories which the ScrumMaster would then put up on the Scrum Board. The Indian team then picked up the stories and the ScrumMaster would move the cards around the board.

After the Daily Scrum Meeting, team members would inform the ScrumMaster of the remaining hours and he would update the burn down charts. Naturally, this couldn’t go on for long without people getting burned out in the process and eventually diving back head-on into the waterfall. So the ScrumMaster decided to introduce Scrum tools to help the sprints gallop at a good pace.

Enter ScrumWorks, CruiseControl and Perforce. Notably, it was also decided to follow Scrum during integration of these tools. A separate infrastructure team was set up with their own sprints to integrate these tools into the mainstream project.

ScrumWorks

ScrumWorks Pro is used for all day-to-day activities in a project. The tool is hosted on a server and hence is accessible over the Internet. The Product Owner (PO) defines the product backlog in priority order and, during the sprint planning meeting, the team picks up the stories and moves them to the sprint backlog.

A screenshot of the backlog window in ScrumWorks Pro is shown below.

Backlog window in ScrumWorks Pro

If you observe the above figure, the Product Backlog window fills the main screen once a Product has been opened or created. It allows you to edit, prioritize, and schedule your work.

Higher priority work appears higher on the list than lower priority work. You can reprioritize and schedule your work by dragging and dropping backlog items, tasks, and releases.

The left pane represents Sprints (typically 4-week iterations) in chronological order. The right pane represents “uncommitted” (unscheduled) work, optionally organized into “releases”. You can also use the “release” feature to group related backlog items.

The tabs on the left pane display the teams that are working on the open product. Once the sprint is planned, the team updates the remaining hours on the stories after the daily scrum meeting. This lets the stakeholders see the progress on the burn down chart during every sprint.
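The data behind a burn down chart is simply the remaining hours reported after each daily scrum, compared against an ideal straight line from the total estimate down to zero. The Python sketch below illustrates that calculation; it is not ScrumWorks code, and the function name, the 10-day sprint and the hour figures are made up for the example.

```python
def burn_down(total_hours, daily_remaining, sprint_days):
    """Pair each day's actual remaining hours with the ideal straight-line value.

    `daily_remaining[i]` is the remaining effort reported after day i + 1.
    Returns a list of (day, actual_remaining, ideal_remaining) tuples.
    """
    rows = [(0, total_hours, float(total_hours))]
    for day, remaining in enumerate(daily_remaining, start=1):
        ideal = total_hours * (1 - day / sprint_days)
        rows.append((day, remaining, round(ideal, 1)))
    return rows


# Hypothetical 10-day sprint with 100 estimated hours, five days reported so far.
chart = burn_down(100, [92, 85, 80, 66, 55], sprint_days=10)
for day, actual, ideal in chart:
    status = "behind" if actual > ideal else "on track"
    print(f"day {day}: remaining={actual}h ideal={ideal}h ({status})")
```

Plotting the actual column against the ideal column gives the familiar chart. A point above the ideal line means the sprint is running behind the plan.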

A sample screenshot of a sprint chart is shown below.

Sprint chart


To highlight any impediments during the sprint, we used the Impediments window (shown below) to report all issues. The ScrumMaster tracks the impediments from this window.

ScrumWorks - Impediments


Complete Setup

We had various dedicated environments for our project – development, integration, QA, staging and finally production. Our Enterprise version product was related to the Supply Chain Management domain with a wide user base. With teams distributed across the US and India, we required a Software Configuration Management System which could support this work model. Perforce supported our iterative model and fit into our system. To start with, it had an easy GUI, was very flexible and allowed us to create private branches from the main trunk when required. We could also run automated tests against a specific branch. The merging was handled by Perforce with color coded indication of the changes. It also seamlessly integrated with our Visual Studio IDE which made check-in/check-out very easy.

With Perforce in place as the Change Management System, we hooked up CruiseControl as our big melting pot Integration Server. CruiseControl for .Net, or CCNet as it is popularly known, is an automated integration server. It automates the integration process by monitoring the Perforce source control repository directly. Once developers check in any code to Perforce, CruiseControl picks up the modified files and their associated components and creates a new build on the Integration server. When the build is complete, the server notifies the developers whether the changes they committed integrated successfully or not. Automating integration in this way ensures that an integration build runs whenever code is checked in, without anyone having to remember to start it.

The quality team would then deploy the successful builds on their environments and run a full set of functional and regression tests before certifying it for Staging and Production.

These tools helped both the US and Indian team members to effectively manage themselves. Once the team started using ScrumWorks, they did not have to depend on their US counterparts to access the sprint backlog as well as update the progress during the sprints. The stakeholders could also review the project progress any time.

The combination of following Scrum and using compatible tools has made this project a great success!

[1] ScrumWorks is a registered trademark of Danube Technologies, Inc. [1] PERFORCE is a registered trademark of Perforce Software, Inc.

Sushil Sapre– Tech Lead of Xoriant’s Microsoft COE

23
Sep
This entry is part 1 of 5 in the series HTML5 Series

Welcome to the series on HTML5 development brought to you by the team at Xoriant.

HTML5 has been in the news of late and it promises to help deliver the next set of web applications with features that closely resemble some top of the line desktop applications.

Each part of this series will demonstrate one feature of HTML5, with emphasis on a short working example. The idea is to introduce to you what your applications could gain by using HTML5 today. We will focus less on controversial stuff like Flash vs. HTML5 and more on the features. As a developer, you are the best judge of whether to use the technology in your applications today, after looking at all the factors. We will focus particularly on bringing these experiences to mobile web applications, where the WebKit based browsers on most Smart phones today are more than capable of supporting several HTML5 features.

All you need is a text editor (or your favorite IDE), a Web Server (maybe) and a HTML5 compliant browser (more on that later).

Ready, Set … Let's Go!

Which Browser?

First, a word about the browser support for HTML5. To summarize in brief, HTML5 support has been announced in most browsers and all of them support core features, though several features present in the specification are still not implemented. With the announcement of IE9 Beta, HTML5 support has got a fillip in the IE world too, though Google Chrome presents your best bet at this point in time to try out some of the features.

Enter “The HTML5 Test”

A quick way to figure out how well your browser supports the HTML5 specification is to navigate to http://www.html5test.com. This will display a score indicating how well your browser supports HTML5. You should not worry too much about the score. Rather, there are 2 things that are particularly important here and can actually act as a guide to your learning HTML5:

  • Check out all the features that are there in HTML5. Just a look at the number of features should get you excited about HTML5. Pick each of them up, refer to the web and learn about them. We will also be covering most of them in our series.
  • Once you have learnt about a particular feature and wish to see it work, check if your browser is supporting that feature of HTML5. If Yes – you are good to go. If not – you will need to wait.

A quick look at http://www.html5test.com in my browser (I am running Google Chrome 6.0.472.63 beta) shows that it does quite well.

html5test.com in Google Chrome browser


Some assumptions

We will assume that our readers are familiar with developing basic web applications with HTML, JavaScript and a little bit of CSS. That will help you best to appreciate the new capabilities that HTML5 is bringing forth to the table.

What about Mobile Browsers?

The thing that excites us the most today is the fact that most Smart phones shipping today are equipped with the fantastic WebKit browser, which has great support for several HTML5 features. So if you have a Smart Phone, be it Android or the iPhone, you are all set. The other Smart Phone manufacturers including RIM and Nokia have also announced support for WebKit, so moving forward your application will be well placed to be functional on newer devices coming from them.

Where do we go from here?

Well, the question to ask should be “Where am I?” before we can go anywhere. And with that, we can look at the Geolocation APIs that are provided in HTML5. Geolocation is all about telling us where we are. In short, we will know the Latitude and Longitude, though it gives much more information than that like Altitude, Accuracy, etc.

To make sure that your browser supports Geolocation, point it to http://html5test.com and scroll down to the Geolocation section, where you should see a score of 10 as shown below.

Browser supports Geolocation

If there is no Geolocation support, you will see the screen below instead:

Browser does not support Geolocation

In that case, I strongly suggest that you get the latest version of either Google Chrome or Firefox to see the example work. If you cannot, you can still follow the article.

The code (geotest.html)

We will write a minimal piece of HTML code that is functional in nature and will access the Geolocation support in the browser and show the current Latitude and Longitude. First, take a look at the geotest.html file below:


<!DOCTYPE html>
<html>
<head>
">
<meta charset="utf-8">
<title>HTML Geolocation Test</title>
<script type="text/javascript">

function findCurrentLocation(){
	var geoService = navigator.geolocation;
	if (geoService){
		geoService.getCurrentPosition(showCurrentLocation,errorHandler);
	} else {
		alert("Your Browser does not support Geolocation.");
	}
}

function showCurrentLocation(position){
	document.getElementById("mylocation").innerHTML = "Current Latitude : " + position.coords.latitude + " , Longitude : " + position.coords.longitude;
}

function errorHandler(error){
	  alert("Error while retrieving current position. Error code: " + error.code + ",Message: " + error.message);
}
</script>
</head>

<body>
<div id="main">
	<div id="mylocation"></div>
	<input type="button" value="Get Location" onclick="findCurrentLocation()"/>
</div>
</body>
</html>

Let us analyze the code which is rather straightforward:

1) Our body has a div element named mylocation, that we shall populate once we have the Latitude and Longitude.

2) We have a button, which when clicked calls the findCurrentLocation() method

3) The findCurrentLocation() method is where all the magic happens. The standard navigator object now has an object named geolocation. So, if your browser supports the geolocation object the navigator object will contain it. That is the first test we do. If there is no object then we display a standard alert informing the user that his/her browser does not support Geolocation.

4) If you do have support for the geolocation object, then all you need to do is invoke the getCurrentPosition method on it. The getCurrentPosition method takes 3 parameters:

  1. A success callback function
  2. An error callback function
  3. An optional third parameter where you can provide additional options for getting the current position

5)  The success callback function in our case is the showCurrentLocation method. This method will get passed one parameter, an object of type Position. This object has two properties: timestamp, and coords, which is an instance of type Coordinates. The coords object has several properties related to the current location like latitude, longitude, altitude and several more. For our example, we are interested only in latitude and longitude.

6) The error callback function in our case is the errorHandler method. This method will get passed one parameter, an object of type PositionError. This object has two properties: code and message. The code is either 1 (PERMISSION_DENIED), 2(POSITION_UNAVAILABLE) or 3(TIMEOUT). The message is dependent on the device / machine browser.

7) We can leave out the 3rd optional parameter for now, since it is not fully supported on all browsers.

8)  So once the success callback function i.e. showCurrentLocation is invoked, we simply retrieve the latitude and longitude and populate the div with the results.
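
As a minimal sketch of the other properties mentioned in point 5, the success callback could also report accuracy and altitude. The helper name describePosition is our own and not part of the original example; we are assuming, as per the Geolocation specification, that accuracy is reported in metres and that altitude is null when the device cannot provide it.

```javascript
// Hypothetical helper (not part of the original geotest.html): builds a
// readable summary from the Position object passed to the success callback.
function describePosition(position) {
	var coords = position.coords;
	var text = "Latitude : " + coords.latitude + " , Longitude : " + coords.longitude;
	// accuracy is the estimated error of latitude/longitude, in metres.
	if (typeof coords.accuracy === "number") {
		text += " , Accuracy : " + coords.accuracy + " m";
	}
	// altitude is null when the device cannot provide it.
	if (coords.altitude !== null && coords.altitude !== undefined) {
		text += " , Altitude : " + coords.altitude + " m";
	}
	return text;
}
```

Inside showCurrentLocation, you could then assign describePosition(position) to the innerHTML of the mylocation div instead of building the string by hand.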

Our Geolocation Example in Action

We serve the geotest.html file via a local web server as shown below. Please note that I am using the latest version of the Google Chrome browser.

Geolocation Example


When we click the Get Location button, we see that the browser prompts a warning message as shown below. This is important since it is recommended that all applications that deal with the location of the user should always allow the user to opt-in for sharing the location. It is good practice and strongly recommended, since it is a privacy issue and more users in the future will demand it.

Geolocation Example


If we click on Allow, it takes a little while (at times) to get the location, and then the current latitude and longitude are printed below.

Geolocation Example


Do note that since you allowed the browser to share the location, it has remembered that you have opted in. On subsequent calls, and even across browser restarts, the browser remembers this and no longer prompts you for permission to determine your location. You can clear this preference by clicking on the icon shown below and then clicking on Clear these settings for future visits.

Geolocation Example

What happens if we refuse the browser permission to determine our location? Let us find out by running our example again. We navigate to the home page i.e. geotest.html and once again the browser prompts us, asking for our permission, as shown below:

Geolocation Example

This time we refuse it by clicking on Deny. This results in the error callback function i.e. errorHandler getting invoked, and we extract the PositionError.code i.e. the value 1 (PERMISSION_DENIED) and the PositionError.message, which is device specific. In the case of Google Chrome, it is correctly populated with “User denied Geolocation” as shown below.

Do not assume that this message will be populated correctly. The same error message on Firefox browser gave the following alert message, as shown below:

So remember to provide a user friendly message that is populated by your application rather than depending on the device to do that.
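
One way to follow this advice is sketched below: the application maps the three error codes to its own messages and ignores the browser-supplied text. The helper name friendlyGeoErrorMessage and the wording of the messages are our own assumptions; only the codes 1, 2 and 3 come from the PositionError object described earlier.

```javascript
// Hypothetical helper: maps PositionError.code to a message owned by the
// application, instead of relying on the browser-specific error.message.
function friendlyGeoErrorMessage(error) {
	switch (error.code) {
		case 1: // PERMISSION_DENIED
			return "You chose not to share your location.";
		case 2: // POSITION_UNAVAILABLE
			return "Your location could not be determined right now.";
		case 3: // TIMEOUT
			return "Finding your location took too long. Please try again.";
		default:
			return "An unknown error occurred while finding your location.";
	}
}

// A possible replacement for the errorHandler callback in geotest.html.
function errorHandler(error) {
	alert(friendlyGeoErrorMessage(error));
}
```

With this in place, the user sees the same message for a denied request regardless of which browser is in use.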

What if the Browser does not support Geolocation?

If the browser does not support Geolocation, our test code handles the scenario by displaying an alert to the user. For example, I accessed our sample page via Internet Explorer 9 Beta and found that it displayed the appropriate message as shown below:

Remember the HTML5 Test that we covered earlier in the article? If we access http://www.html5test.com in Internet Explorer 9 Beta, we get the correct results for the Geolocation feature as shown below.

So remember to save yourself enough heartburn and use html5test.com as a quick way to ascertain if a particular feature is supported or not.

What can we do with our location?

Location is the starting point of an entirely new class of applications/services called “Location Based Services” (LBS). Once you know the location of the user, several value added features can be built on top of it. The location of a user is such a hot topic nowadays that everyone from Google and Yahoo to Twitter and Facebook has added the user's location to their applications. Examples of how you can use a location are:

· Weather Reports for a particular location as the user is passing through it

· Traffic incidents as they happen

· Retail offers from merchants in the vicinity of the location that the user is currently in

· A Friend Locator application that informs the user of his/her friends if they are within close proximity of each other.

Most of these LBS applications also make use of Google Maps to provide an intuitive user interface.
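
To give a flavour of the Friend Locator idea, here is a minimal sketch that decides whether two users are within a given radius of each other. It uses the haversine formula, which is our choice for illustration and treats the Earth as a sphere. The function names distanceInKm and isNearby are hypothetical, and the objects passed in are assumed to carry latitude and longitude in degrees, like the coords object from the Geolocation API.

```javascript
// Hypothetical helper for a "friend locator": great-circle distance in
// kilometres between two {latitude, longitude} points (haversine formula).
function distanceInKm(a, b) {
	var EARTH_RADIUS_KM = 6371; // mean Earth radius, an approximation
	var toRad = Math.PI / 180;
	var dLat = (b.latitude - a.latitude) * toRad;
	var dLon = (b.longitude - a.longitude) * toRad;
	var h = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
		Math.cos(a.latitude * toRad) * Math.cos(b.latitude * toRad) *
		Math.sin(dLon / 2) * Math.sin(dLon / 2);
	return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// True when the two points are no more than radiusKm apart.
function isNearby(a, b, radiusKm) {
	return distanceInKm(a, b) <= radiusKm;
}
```

For example, isNearby(myCoords, friendCoords, 2) would tell the application whether a friend is within roughly 2 km, where myCoords and friendCoords are placeholder names for two sets of co-ordinates.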

Conclusion

We saw how easy it was to get access to Location using the Geolocation APIs in HTML5. Geolocation data when combined with Maps and Other services can help create powerful applications.

Please participate

How important is Location to your application? How did you incorporate it into your application? What do you think of HTML5 Geolocation APIs? Would it make things simpler/better? Please discuss in the comments below. We look forward to your active participation.

Romin Irani– Principal Architect

26
Sep
This entry is part 2 of 5 in the series HTML5 Series

In the first part of our series, we saw how to use the Geolocation feature of HTML5. While this part will not focus on any specific feature of HTML5, it will bring to the table an interesting way in which you can determine if your browser supports a particular HTML5 feature or not.

To understand that a little bit, let us go back to the code in the first part in which we determine if the browser has Geolocation support or not. Take a look at the code snippet below that is reproduced from the first part:


function findCurrentLocation(){
	var geoService = navigator.geolocation;
	if (geoService){
		geoService.getCurrentPosition(showCurrentLocation,errorHandler);
	} else {
		alert("Your Browser does not support Geolocation.");
	}
}

The above Javascript function findCurrentLocation() first determines whether there is Geolocation support or not. If you notice, it queries the standard navigator object first. If your browser supports Geolocation, then navigator will have an object named geolocation. We then put in an if statement that does something if support is there; if there is no support, it falls back to another path of execution.

While this approach is fair enough, you will soon end up with many similar pieces of code, each determining whether a certain HTML5 feature exists or not. In certain situations you may also resort to browser sniffing (User Agent detection) to decide whether to include a certain piece of code. This soon becomes unwieldy and clutters up your code too. You need a cleaner mechanism for detecting support for HTML5 features (since we are still in a transition phase where not all browsers support all features).

Enter Modernizr. To quote from their web site, Modernizr is a small and simple JavaScript library that helps you take advantage of emerging web technologies (CSS3, HTML 5) while still maintaining a fine level of control over older browsers that may not yet support these new technologies.

And how does it do that? Modernizr creates a global JavaScript object (called Modernizr) which contains properties for each feature. If your browser supports it, it will evaluate to true, else it will be false.

Modernizr is useful for all aspects of HTML5. It helps you detect not only whether your browser supports the new HTML5 markup elements and CSS3 features, but also the Javascript APIs (Geolocation, Storage, Web Workers, etc). Keep in mind that it does not enable any of these features; it is simply there to help you detect them.
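
To make the idea concrete, here is a simplified sketch of what feature detection looks like under the hood. This is not Modernizr's actual code, and its real tests are more thorough. The function name detectFeatures is our own, and the nav and win parameters stand in for the browser's navigator and window objects so that the logic is easy to follow.

```javascript
// Simplified illustration of feature detection (NOT Modernizr's actual code).
// nav and win stand in for the browser's navigator and window objects.
function detectFeatures(nav, win) {
	return {
		// The feature is considered present if the object exposes the property.
		geolocation: !!(nav && "geolocation" in nav),
		localstorage: !!(win && "localStorage" in win && win.localStorage),
		sessionstorage: !!(win && "sessionStorage" in win && win.sessionStorage)
	};
}
```

In a browser you would call detectFeatures(navigator, window) and get back an object of true/false flags, which is the same shape of result that the Modernizr object gives you.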

You can refer to its complete documentation for all the features that it can help detect for HTML5. I provide a brief summary below for certain HTML5 features and we will see how to modify our function above to incorporate Modernizr.

HTML5 Feature       Modernizr Object
Geolocation         Modernizr.geolocation
Local Storage       Modernizr.localstorage
Session Storage     Modernizr.sessionstorage

The template pattern for using any of these objects is:

if (Modernizr.<feature>) {
	// Yes! Your browser supports the feature, so take advantage of it
}
else {
	// No! Your browser does not support the feature, so fall back on some other mechanism. Degrade gracefully.
}

Using Modernizr in your code
We shall now modify the Geolocation code shown at the start of this article, which we covered previously in the series, to employ Modernizr.

Follow these steps to include Modernizr:

  1. Download modernizr-1.5.min.js from the main web site
  2. Include the script in your code. This will initialize Modernizr when the page loads. In fact, it creates the global Modernizr JavaScript object with all the feature objects created, so that you can start using it in your code.
  3. Finally, use the template pattern in your code as shown above.

I will recreate the geotest.html example that we covered in Part 1: Geolocation article.

<!DOCTYPE html>
<html>
<head>
">
<meta charset="utf-8">
<title>HTML Geolocation Test</title>
<script src="js/modernizr-1.5.min.js"></script>
<script type="text/javascript">

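// Called by the Get Location button. Modernizr.geolocation is true
// only when the browser supports the Geolocation API.
function findCurrentLocation(){
	if (Modernizr.geolocation) {
		navigator.geolocation.getCurrentPosition(showCurrentLocation,errorHandler);
	}
	else {
		alert("Your Browser does not support GeoLocation.");
	}
}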
… REST OF THE CODE
</script>
</head>

<body>
<div id="main">
	<div id="mylocation"></div>
	<input type="button" value="Get Location" onclick="findCurrentLocation()"/>
</div>
</body>
</html>

If you look at the above code, we have simply included the script file i.e. js/modernizr-1.5.min.js and then checked if the Geolocation feature is supported. The rest of the code remains straightforward.

While this is a simple example, the utility of Modernizr becomes visible when you deal with many other features of HTML5. Let it do all the hard work of determining a particular feature of HTML5 in your browser.
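
As a sketch of the same pattern applied to another feature from the table above, the code below uses Local Storage when it is available and degrades gracefully to an in-memory object when it is not. The function name createStore and its two parameters are our own assumptions: features stands in for the Modernizr object and storage for window.localStorage.

```javascript
// Hypothetical helper: returns a small key/value store. It uses the browser's
// localStorage when the Modernizr-style flag says it is supported, and
// degrades to an in-memory object otherwise.
function createStore(features, storage) {
	if (features.localstorage && storage) {
		return {
			persistent: true,
			get: function (key) { return storage.getItem(key); },
			set: function (key, value) { storage.setItem(key, String(value)); }
		};
	}
	var memory = {}; // lost when the page is closed
	return {
		persistent: false,
		get: function (key) {
			return Object.prototype.hasOwnProperty.call(memory, key) ? memory[key] : null;
		},
		set: function (key, value) { memory[key] = String(value); }
	};
}
```

In a page that has loaded Modernizr, the call would look like createStore(Modernizr, window.localStorage). The rest of the application can then use get and set without caring which of the two paths was taken.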

Please participate

We look forward to your feedback on this article and the others in the series. Let us know which feature you would like to see covered. Additionally, do share your experiences with HTML5.

Romin Irani– Principal Architect

28
Sep
This entry is part 3 of 5 in the series HTML5 Series

Hope you have been enjoying the series on HTML5 so far, where we covered HTML5 GeoLocation and the Javascript library Modernizr. In this part of the series, we shall cover the new HTML Form elements that have been introduced in HTML5.

A Form is one of the most basic and essential features of any web site. The form elements available in HTML so far include the textbox, checkbox, radio, button, drop-down list, password and file picker. While these have sufficed so far, there is a clear need for newer form elements. The question is not just of newer form elements, but of the ability to inject behavior into existing form elements so that usability and validity, which are cornerstones of any good UI, are given the highest consideration.

We shall break up HTML5 Form features into 2 parts in this series. This part will be more focused on the different types of input elements that are introduced in HTML5. In the next part, we shall look at how additional attributes introduced in HTML5 bring in significant improvements to both usability and also to the development code.

New Input Types

HTML5 brings to the table several new input types, a total of 13. The philosophy behind these new input types is to address the common types of data fields that users typically have to fill up in a web form. Apart from plain text, there is data to be filled up like email address, web site URLs, Phone Numbers, Date/Time, Numeric data, Colors, etc.

HTML5 introduces these data types via the <input type="_NEW_TYPE_HERE_"/> format.

What if my browser does not support these new HTML5 form elements?

No problem. One of the key design decisions in HTML5 is backward compatibility. What this means is that if a new input type is not supported, then by default it falls back to <input type="text"…./>, so it will be rendered as a plain text box, which the user can then fill data in.

One may ask, what is the advantage of these new input types and how are the browsers supposed to implement them? By supporting the new input types, two things happen:

a) You get automatic validation of the fields as per the format. This means that the form is not going to get submitted if the value entered does not pass the default validation for that type

b) The browser inspects the input type and, if it finds that it is of a specific type, adapts its user interface to aid the input of that data. For example, on Smart Phones that have a virtual keyboard instead of a physical one, the keyboard that is shown will only contain keys that help the user fill out that kind of data.
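
The fallback to a plain text box also gives us a simple way to check for support from script. The sketch below relies on the behaviour described above: when a browser does not recognise a type, the element reports its type as text. The function name supportsInputType is hypothetical, and doc stands in for the page's document object.

```javascript
// Hypothetical helper: reports whether the browser recognises an input type.
// doc is the page's document object. An unsupported type is reported back
// by the browser as "text", which is the fallback described above.
function supportsInputType(doc, typeName) {
	var input = doc.createElement("input");
	input.setAttribute("type", typeName);
	return input.type === typeName;
}
```

In a page, supportsInputType(document, "email") would return true in a browser that implements the email type and false in one that silently falls back to a text box.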

Let us look at the new HTML5 input types that are present.

Email Address

This input type allows for entry of email addresses. The format is shown below:

<input type="email"/>

This input type is useful for entering email ids like test@test.com

Web URLs

<input type="url"/>

This input type is useful for entering web urls like http://www.xoriant.com

Tel

This input type is useful for entering telephone numbers

<input type="tel"/>

The email and url input types tell the browser that the input is valid only if it matches the format of that type. The tel type is not validated against a fixed format, since telephone number formats vary widely, but it still tells the browser what kind of data to expect. Additionally, Smart Phone browsers such as the one on the Apple iPhone do a very neat trick when they encounter input types like URL, Email and Telephone. Since they have a virtual keyboard, they show only those keys that aid in entering the value quickly. So if you are typing a telephone number, the keyboard shows only the numbers and a few related characters; the other keys are hidden from you.

Shown below is a screenshot of the Android Web browser accessing a form which contains an input element of type tel. Notice that in the virtual keyboard, it only shows keys that aid in faster and less error prone data entry.

Android Web browser accessing a form

Date Time

HTML5 provides 6 types of date and time inputs, which cater to the variety of date and time entries that your application may require. They are datetime, date, time, month, week and datetime-local. The syntax is shown below:

<input type="datetime"/>
<input type="date"/>
<input type="time"/>
<input type="month"/>
<input type="week"/>
<input type="datetime-local"/>

Shown below is the datetime input type rendered in the Opera Browser:

Datetime input type rendered in the Opera Browser


Spin Box

A frequent requirement in forms is to fill out numeric values. An input type number has been added. This will render the input as a spinbox if the browser supports it. It will even honor the min, max values that you specify. The step attribute is used to indicate the increment/decrement if the user clicks the up/ down spin button. The value attribute specifies an initial value for the control.


<input type="number"
       min="1"
       max="5"
       step="1"
       value="3">
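
In a browser that falls back to a plain text box, the min, max and step values are not enforced for you. The sketch below mirrors those three constraints in script so that the application can check the value itself. The function name isValidNumberInput is hypothetical, and treating an empty field as invalid is an assumption of this sketch.

```javascript
// Hypothetical fallback check for browsers that render type="number" as a
// plain text box: mirrors the min, max and step constraints in script.
function isValidNumberInput(rawValue, min, max, step) {
	var trimmed = String(rawValue).trim();
	if (trimmed === "") { return false; } // assumption: empty is invalid
	var value = Number(trimmed);
	if (isNaN(value)) { return false; }
	if (value < min || value > max) { return false; }
	// The value must be a whole number of steps away from min.
	var steps = (value - min) / step;
	return Math.abs(steps - Math.round(steps)) < 1e-9;
}
```

For the spin box above, isValidNumberInput(field.value, 1, 5, 1) accepts 1 through 5 and rejects values such as 6, 2.5 or abc, where field is a placeholder name for the input element.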

Slider

Spin boxes are not the only way to enter numeric input. You can also allow a user to use a slider to specify a value. The syntax is shown below:

<input type="range"
       min="1"
       max="5"
       step="1"
       value="3">

If the browser supports the number and range types, it will render as shown below:

Browser supports the number & range types


Search

This input is specified by giving the type="search" as shown below.

<form>
  <input name="search_terms" type="search">
  <input type="submit" value="Go">
</form>

It will be rendered as shown below:

As you type, a cross icon will appear at the end of the input box and you can clear the search term by clicking on it.

autofocus, placeholder and novalidate

There are 3 attributes that we can apply to HTML5 forms that aid in data entry.

  • autofocus: This attribute when applied to any form element, will result in the field receiving focus. For e.g. consider the form shown below:

<form>
	<label for="firstname">First Name</label>
	<input type="text" id="firstname" name="firstname" autofocus>
	<label for="lastname">Last Name</label>
	<input type="text" id="lastname" name="lastname">
	<input type="submit" value="Go">
</form>

We have added the attribute autofocus to the firstname input field. When the form loads, you will find that the focus is already set on that field, thereby making it easier for the user to start filling the form.

  • placeholder: This attribute when applied to any form element, will display helper text to aid the user to fill up the value. When you give focus to the element, the placeholder value will go away and let the user enter the value. If no value is entered and you move away to another field, then the placeholder value will be shown again. Consider the same example as above but with placeholder values for both firstname and lastname fields.

<form>
	<label for="firstname">First Name</label>
	<input type="text" id="firstname" name="firstname" placeholder="Enter First Name here" autofocus>
	<label for="lastname">Last Name</label>
	<input type="text" id="lastname" name="lastname" placeholder="Enter Last Name here">
	<input type="submit" value="Go">
</form>

When we visit this page, we see the following form:

Placeholder rendered

You will notice that we had put placeholders on both the fields, but since the firstname field also has autofocus, its placeholder text is not shown. Instead it has the focus so that the user can start entering the text. The lastname field shows its placeholder text. As you tab in and out of the First Name field without entering any data, you will find the placeholder text returning there.

Do note that not all browsers support placeholder text. The screenshot above is from the Google Chrome browser.

  • novalidate: By default, when you submit a form, all the fields in the form are validated. This means that a browser that fully supports the HTML5 input types will validate each field as per its input type and, if any field is not valid, the form is not submitted. Browser support varies here, so be careful. You can opt for the form to not be validated by adding the novalidate attribute to the <form> tag as shown below:

<form …. novalidate>
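
Since support for these attributes varies, a page can check for them before relying on them. Here is a minimal sketch, assuming that a browser which supports an attribute also exposes a matching property on the element. The function names are hypothetical and doc stands in for the page's document object. Note that the HTML5 specification names the form attribute for skipping validation novalidate, and it is exposed to script as the noValidate property.

```javascript
// Hypothetical helper: checks whether elements of a given tag expose a
// property, which is how script can tell if an attribute is supported.
function supportsAttribute(doc, tagName, propertyName) {
	return propertyName in doc.createElement(tagName);
}

// Collects the three attributes discussed above. The novalidate attribute
// is exposed to script as the noValidate property of a form element.
function detectFormAttributes(doc) {
	return {
		autofocus: supportsAttribute(doc, "input", "autofocus"),
		placeholder: supportsAttribute(doc, "input", "placeholder"),
		novalidate: supportsAttribute(doc, "form", "noValidate")
	};
}
```

If detectFormAttributes(document).placeholder turns out to be false, the application can fall back to showing the helper text next to the field instead.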

Conclusion

The new HTML5 input types and several attributes introduced in the specification are a clear step to address data input via forms, which forms a cornerstone of any public web application. Do note that there are many more features like pattern, etc which we have not discussed in this blog post, but the reader is encouraged to go and refer to documentation on the standard. While testing out any of these features, it is important to first determine if your browser supports them. In our tests, we have found the Opera browser to be one of the first to implement the different input types, however latest versions of Chrome and Firefox Beta are not far behind. We believe that the browsers will catch up soon and implement most of them.

Please participate

We look forward to your feedback on this article and the others in the series. Let us know which feature you would like to see covered. Additionally, do share your experiences with HTML5.

Romin Irani– Principal Architect

04
Oct
This entry is part 4 of 5 in the series HTML5 Series

Welcome to Part IV of this series. We covered the basics of HTML5 Geolocation support in Part I of this series. In that article, we looked at how an HTML5 based web application can make use of Geolocation support built in the browser.

To recap, I will recreate the Javascript code for you below:


<script type="text/javascript">

function findCurrentLocation(){
	var geoService = navigator.geolocation;
	if (geoService){
		geoService.getCurrentPosition(showCurrentLocation,errorHandler);
	} else {
		alert("Your Browser does not support Geolocation.");
	}
}

function showCurrentLocation(position){
	document.getElementById("mylocation").innerHTML = "Current Latitude : " + position.coords.latitude + " , Longitude : " + position.coords.longitude;
}

function errorHandler(error){
	  alert("Error while retrieving current position. Error code: " + error.code + ",Message: " + error.message);
}
</script>

In the above code, we used the getCurrentPosition method on the navigator.geolocation object to get the current location co-ordinates. The main point to note about this method is that it is just a one-time call to get the location co-ordinates. The application is then in control to provide location relevant data. So while this might suffice for certain kinds of applications, it may not be enough for an application that needs the location regularly. For example, you may wish to write an application that provides different location based data as the user's co-ordinates change i.e. as the user moves.

To do that, you need to substitute the getCurrentPosition call with watchPosition. This method functions in a similar manner to getCurrentPosition, except that your application gets notified whenever the user's position changes. In addition to notifying you i.e. calling the success function when the location has changed, the watchPosition method also returns a unique ID. This ID is used together with a paired method called clearWatch. So whenever you wish to stop tracking the user's location (which you should, by the way, since these operations drain the battery), you can pass the ID as a parameter to the navigator.geolocation.clearWatch method. This will stop the tracking of the user’s location.

Let us take a look at our modified code that continuously gets notified when the location of the user changes. We are simply changing the code that we covered in Part I of the series.

Do note that I have decided to use the Modernizr Javascript library for detecting HTML5 support that we covered in Part II of the series.

The geotest.html file is shown below:

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>HTML GeoLocation Test</title>
<script src="js/modernizr-1.5.min.js"></script>
<script type="text/javascript">
var watchPositionID = null; // null means no watch is active

function startLocationTracking(){
	alert("startLocationTracking");
	if (Modernizr.geolocation) {
		alert("Your Browser supports GeoLocation");
		// watchPosition returns an ID that clearWatch needs later
		watchPositionID = navigator.geolocation.watchPosition(showCurrentLocation,errorHandler,{enableHighAccuracy: true});
	} else {
		alert("Your Browser does not support GeoLocation.");
	}
}

// Success callback: invoked every time the location changes
function showCurrentLocation(position){
	document.getElementById("mylocation").innerHTML = "Current Latitude : " + position.coords.latitude + " , Longitude : " + position.coords.longitude;
}

// Error callback: invoked when the location cannot be determined
function errorHandler(error){
	  alert("Error while retrieving current position. Error code: " + error.code + ",Message: " + error.message);
}

function stopLocationTracking(){
	// Compare against null rather than 0, since the ID value is browser specific
	if (watchPositionID !== null) {
		navigator.geolocation.clearWatch(watchPositionID);
		watchPositionID = null;
		alert("Stopped Tracking Location");
	}
}

</script>
</head>
<body>
<div id="main">
	<div id="mylocation"></div>
	<input type="button" value="Start" onclick="startLocationTracking()"/>
	<input type="button" value="Stop" onclick="stopLocationTracking()"/>
</div>
</body>
</html>

Let us go through the code in brief:

  • Our HTML page is simple. It has two buttons, one to start location tracking and the other to stop it.
  • The Start button invokes the startLocationTracking method. This method first determines whether Geolocation support is present. If it is, it invokes the watchPosition() method on the navigator.geolocation object. The first parameter to watchPosition() is the success function that gets invoked whenever the location is determined. That success function, showCurrentLocation, simply displays the location in terms of latitude and longitude. The second parameter, errorHandler, is invoked if the location cannot be retrieved, and the third is an options object that requests high accuracy.
  • The watchPosition method returns an ID that we save in the global variable named watchPositionID.
  • The Stop button invokes the stopLocationTracking method. This method simply passes the watchPositionID as a parameter to the navigator.geolocation.clearWatch(…) method. This stops further invocations of the success function, showCurrentLocation, when the user’s location changes.
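The start/stop bookkeeping described above can be sketched as a small helper. This is only an illustration: createLocationTracker is a hypothetical name, and the geo argument is assumed to be any object that exposes watchPosition() and clearWatch() the way navigator.geolocation does. The sketch stores null rather than 0 while idle, so it does not rely on which values a browser hands out as watch IDs.

```javascript
// Hypothetical helper (not part of the Geolocation API itself).
// "geo" is assumed to behave like navigator.geolocation.
function createLocationTracker(geo, onUpdate, onError) {
	var watchId = null; // null means "not tracking"

	return {
		start: function () {
			if (watchId !== null) {
				return false; // already tracking, do not register twice
			}
			watchId = geo.watchPosition(onUpdate, onError, { enableHighAccuracy: true });
			return true;
		},
		stop: function () {
			if (watchId === null) {
				return false; // nothing to stop
			}
			geo.clearWatch(watchId);
			watchId = null;
			return true;
		},
		isTracking: function () {
			return watchId !== null;
		}
	};
}
```

In a browser, the first argument would be navigator.geolocation, and the Start and Stop buttons would call start() and stop() on the returned object.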

Conclusion

We have seen in this article how we can use the navigator.geolocation object to keep tracking a user’s location as it changes. In an earlier part of the series on HTML5 Geolocation, we had covered getting the location on demand or only once. As per the requirements of your application, you can use any of these two mechanisms. Do keep in mind, that determining the location of the device is quite a drain on the battery, especially on mobile devices. So if you are using the watchPosition method to keep a track as the location changes, be sensitive to the battery consumption too.

Please participate

We look forward to your feedback on this article and the others in the series. Let us know which feature you would like to see covered. Additionally, do share your experiences with HTML5.

Romin Irani– Principal Architect

13
Oct
This entry is part 5 of 5 in the series HTML5 Series

HTML in itself is primarily about designing and presenting content on the Web. Attempts to add new features to HTML and other related APIs have taken the Web environment to a different level altogether.

In this blog, we are going to look into designing features made available in HTML5. Talking about designing in HTML pages, the element canvas takes prime importance.

What is Canvas?

The new canvas element provides scripts with a resolution-dependent bitmap canvas, which can be used for rendering graphs, game graphics, or other visual images on the fly. The canvas API supports the same two-dimensional drawing operations that most modern operating systems and frameworks support. Once a canvas is added to an HTML page using the standard <canvas></canvas> notation, we can manipulate it with JavaScript. To actually draw anything, a script first has to obtain the drawing context of the canvas.

Before canvas, the only options for drawing on a page were images (JPEG or GIF), Flash, applets, or some JavaScript hacks.

Canvas Co-ordinates

Like any other graphics system, canvas has a coordinates system with X=0 and Y=0 at the upper-left corner, as shown below.

Canvas Co-ordinates

Support

Currently, canvas is supported by all major browsers such as Firefox, Chrome, Safari, Opera etc. The recently launched IE9 beta has also incorporated support for canvas.

Note: Before we start, we assume here that the reader has a basic understanding of HTML and JavaScript.

Getting Started

Let’s get your favorite editor and start typing!

In its simplest form, a canvas element can be added in an HTML page using “<canvas></canvas>”. But to do something interesting, we will add some more attributes as well.

<canvas id="myCanvas" width="150" height="150"></canvas>

First things first: For any element, it is a good practice to have some “fallback content” just in case the element is not supported in the browser.

<canvas>Use an updated browser to see new features of HTML5</canvas>

Here is a simple piece of code on which we can build our page:

<html>
<head>
<script>
	function draw() {
		var canvas = document.getElementById("myCanvas");
		var ctx = canvas.getContext("2d");
	}
</script>
</head>
<body onload="draw();">
<canvas id="myCanvas" width="150" height="150">
This example requires a browser that supports HTML5.
</canvas>
</body>
</html>

The above code will create a blank rectangular area. We can use JavaScript to add whatever shapes or patterns we need to add within this area.

Going Line by Line

var canvas = document.getElementById("myCanvas");

This line obtains a reference to the canvas element so that we can use it further in our code.

var ctx = canvas.getContext("2d");

Canvas has the getContext() method which provides a reference through which we can actually draw shapes.

The “2d” parameter requests a 2D context, which represents a flat Cartesian surface whose origin (0,0) is at the top left corner, with x values increasing to the right and y values increasing downward. The 2D context provides a variety of methods for drawing shapes, text and so on.

Trying out a Shape

Here’s a sample piece of code to draw two overlapping rectangles of different colors.

function draw() {
	var canvas = document.getElementById("myCanvas");
	var ctx = canvas.getContext("2d");

	ctx.fillStyle = "rgb(200,0,0)";
	ctx.fillRect (10, 10, 55, 50);		// Draws a filled rectangle
	ctx.fillStyle = "rgba(0, 0, 200, 0.5)";
	ctx.fillRect (30, 30, 55, 50);
}

The result of the above code should look similar to the image below.

Overlapping rectangles

Drawing Paths

There are some additional steps that need to be taken while drawing paths: beginPath(), closePath(), stroke(), fill().

  • Create a path by calling the beginPath() method. Internally, paths are stored as a list of sub-paths (lines, arcs, etc.) which together form a shape. Every time this method is called, the list is reset and we can start drawing new shapes.
  • Next, call the methods that actually specify the paths to be drawn. This could be line or arc or curves.
  • It is optional to call the closePath() method. This method tries to close the shape by drawing a straight line from the current point to the start.

Finally, we can call the stroke() and/or fill() methods. Calling one of these will actually draw the shape to the canvas. stroke() is used to draw an outlined shape, while fill() is used to paint a solid shape.

function draw() {
	var canvas = document.getElementById("myCanvas");
	var ctx = canvas.getContext("2d");

	// Two connected line segments (the path is left open)
	ctx.beginPath();
	ctx.moveTo(125,125);
	ctx.lineTo(125,45);
	ctx.lineTo(45,125);
	ctx.stroke();

	// Smiley face built from four arcs in a single path
	ctx.beginPath();
	ctx.arc(75,75,50,0,Math.PI*2,true); // Outer circle
	ctx.moveTo(110,75);
	ctx.arc(75,75,35,0,Math.PI,false);  // Mouth (clockwise)
	ctx.moveTo(65,65);
	ctx.arc(60,65,5,0,Math.PI*2,true);  // Left eye
	ctx.moveTo(95,65);
	ctx.arc(90,65,5,0,Math.PI*2,true);  // Right eye
	ctx.stroke();
}

The above code should produce results similar to the image below.

Draw Image

The functions used in the sample code above are described in brief below:

  • moveTo(x, y) – This method takes two arguments, x and y, which are the coordinates of the new starting point.
  • lineTo(x, y) – This method takes two arguments, x and y, which are the coordinates of the line’s end point. The line starts where the previous path segment ended, so each segment begins at the end point of the one before it. The starting point can also be changed by using the moveTo() method.
  • arc(x, y, radius, startAngle, endAngle, anticlockwise) – This method takes six parameters: x and y are the coordinates of the circle’s center. radius is self-explanatory. The startAngle and endAngle parameters define the start and end points of the arc in radians, measured from the x axis. The anticlockwise parameter is a Boolean value which, when true, draws the arc anticlockwise; otherwise the arc is drawn clockwise.
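Because arc() works in radians and starts drawing at startAngle, it can help to compute where an arc begins before calling moveTo(). The following is a small sketch with two hypothetical helpers, toRadians and pointOnCircle; they are not part of the canvas API. It reproduces the moveTo(110,75) used before the mouth arc in the sample above.

```javascript
// Convert degrees to the radians that arc() expects.
function toRadians(degrees) {
	return degrees * Math.PI / 180;
}

// Point on a circle at the given angle (in radians), in canvas coordinates.
// Angle 0 lies on the positive x axis. Because y grows downward on a canvas,
// increasing angles move clockwise on screen.
function pointOnCircle(cx, cy, radius, angle) {
	return {
		x: cx + radius * Math.cos(angle),
		y: cy + radius * Math.sin(angle)
	};
}

// Start of the mouth arc: center (75,75), radius 35, startAngle 0
var mouthStart = pointOnCircle(75, 75, 35, 0);
console.log(mouthStart.x, mouthStart.y); // 110 75
```

Moving to the arc's starting point first avoids the extra connecting line that would otherwise be drawn from the current point to the start of the arc.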

Using Images

We can use images in canvas for dynamic photo compositing or as backdrops for graphs etc. Adding images to a canvas is a little tricky, though. There are three ways we can use images:

  • Using images which are on the same page – We can access all images on a page by using either the document.images collection, the document.getElementsByTagName() method, or if we know the ID attribute of the image, the document.getElementById() method.
  • Using other canvas elements – Just as with normal images we access other canvas elements using either the document.getElementsByTagName() method or the document.getElementById() method. Make sure you’ve drawn something to the source canvas before using it in your target canvas.
  • Creating an image from scratch – Another option is to create a new Image object in our script:

var img = new Image();   // Create new Image object
img.src = 'myImage.png'; // Set source path

The image starts loading as soon as its src is set, and it cannot be drawn until loading has finished. To be safe, attach an onload handler before setting src:

var img = new Image();   // Create new Image object
img.onload = function(){
	// execute drawImage statements here
};
img.src = 'myImage.png'; // Set source path

The following example uses an image as the backdrop of a small line graph:

function draw() {
	var ctx = document.getElementById('myCanvas').getContext('2d');
	var img = new Image();
	img.onload = function(){
		ctx.drawImage(img,0,0);
		ctx.beginPath();
		ctx.moveTo(30,96);
		ctx.lineTo(70,66);
		ctx.lineTo(103,76);
		ctx.lineTo(170,15);
		ctx.stroke();
	};
	img.src = 'images/backdrop.png';
}

drawImage(image, x, y) – image is a reference to an image or canvas object and x, y form the coordinates where the image should be placed.

Pixel Data and Canvas

Remember that the canvas API is based on pixels. This gives us the ability to easily access pixel information from canvas and manipulate it.

ctx.getImageData(startX, startY, width, height)

This function returns a representation of the current state of the canvas display as a collection of integers. Specifically, it returns an object containing three properties:

  • width: The number of pixels in each row of the pixel data
  • height: The number of pixels in each column of the pixel data
  • data: A one-dimensional array containing the actual RGBA values for each pixel retrieved from the canvas

RGBA stands for Red, Green, Blue and Alpha. Each component, including the alpha (opacity) component, is an integer from 0 to 255.
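Since the data array is one-dimensional, reading a particular pixel means converting its (x, y) position into an array index. The pixels are stored row by row, with four consecutive entries (R, G, B, A) per pixel. Here is a minimal sketch; getPixel is a hypothetical helper name, and it accepts any object with width and data properties shaped like the one getImageData() returns.

```javascript
// Read the RGBA components of pixel (x, y) from an ImageData-like object.
function getPixel(imageData, x, y) {
	var i = (y * imageData.width + x) * 4; // four array entries per pixel
	var d = imageData.data;
	return { r: d[i], g: d[i + 1], b: d[i + 2], a: d[i + 3] };
}
```

In a browser this could be called as getPixel(ctx.getImageData(0, 0, 150, 150), 10, 10).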

ctx.putImageData(imagedata, dx, dy)

This function can be used to update the canvas display. As you have access to the object with image data (using getImageData()), now you can easily modify the pixel values in the data array mathematically, because they are each simply integers from 0 to 255.
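As an illustration of modifying the pixel values mathematically, the sketch below inverts the colors of an image data object in place. invertColors is a hypothetical helper, not a canvas method, and it leaves the alpha component untouched.

```javascript
// Invert the red, green and blue components of every pixel in place.
function invertColors(imageData) {
	var d = imageData.data;
	for (var i = 0; i < d.length; i += 4) {
		d[i] = 255 - d[i];         // red
		d[i + 1] = 255 - d[i + 1]; // green
		d[i + 2] = 255 - d[i + 2]; // blue
		// d[i + 3] is alpha and is left as it is
	}
	return imageData;
}

// In a browser, assuming ctx is a 2d context of a 150x150 canvas:
//   var pixels = ctx.getImageData(0, 0, 150, 150);
//   ctx.putImageData(invertColors(pixels), 0, 0);
```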

ctx.createImageData(sw, sh)

This method can be used if you want to create a new image from scratch using canvas. This set of data can be programmatically changed as before, even though it does not represent the current state of the canvas when retrieved.

Canvas Security

There could be security issues with respect to using pixel manipulation in canvas. For this reason, the concept of an origin-clean canvas was specified, so that canvases that are tainted with images from origins other than the source of the containing page cannot have their data retrieved.

Any canvas that contains images rendered from remote origins will throw a security exception if the getImageData() function is called. It is acceptable to render remote images into a canvas from another origin as long as you (or any other scriptwriter) do not attempt to fetch the data from that canvas after it has been tainted.
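A script that may be handed a tainted canvas can guard the call instead of letting the exception propagate. The following is only a sketch: tryReadPixels is a hypothetical helper, and it treats any exception from getImageData() as "pixels not available", which is an assumption made to keep the example short.

```javascript
// Returns the image data, or null if the context refuses to hand it over
// (for example because the canvas is no longer origin-clean).
function tryReadPixels(ctx, width, height) {
	try {
		return ctx.getImageData(0, 0, width, height);
	} catch (e) {
		return null;
	}
}
```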

So as we have seen in this blog, the canvas API provides a powerful way to draw on HTML pages with images, gradients and paths. When properly planned, canvas applications can range from simple charts to complex data visualization tools to cool animations and more!


Saurabh Akshekar– Sr. Software Engineer

29
Oct
This entry is part 6 of 5 in the series HTML5 Series

What are we going to talk about in this blog?

In this blog, I shall introduce you to the power of WebSockets – a new technology providing bidirectional, full-duplex communication channels over a single TCP socket, solving most of the issues faced in real time communication.

What’s in it for me?

I have heard a lot of people assuming that HTML5 just deals with UI components and has nothing for programmers. This article puts this apprehension to rest.

Is it really worth it?

Real-time applications such as stock trading, banking, financial applications, online gaming, gambling, etc. need very reliable and high performance systems. Imagine playing an online poker game on Facebook and waiting for the other player to finish their turn. It is, at times, frustrating just to wait for your turn when the communication is not fast enough; even though it is happening in nearly real time, you can still feel the delay. The above example is not that critical, but imagine the delay in stock trading, banking, financial applications, currency exchanges, etc. To cite another example, the Indian currency exchange rate changes 10 to 15 times in a minute. Even a small delay could be crucial.

Anatomy of HTTP

HTTP, which is widely used, is only half duplex, i.e., the data can flow only in one direction at a given time, thereby making real time communication difficult. Also, each request and response has header information which adds to the network traffic. To find out the HTTP request and response headers, you can use an application like the Live HTTP Headers add-on in the Firefox browser and then see the results. There is a lot of unnecessary HTTP request and response header information overhead, at times, as much as 2000 bytes.

But I already have applications running on the Web! Why use WebSockets?

Some techniques used such as polling, long-polling and streaming have their own limitations:

Polling: This technique is used in Ajax applications to simulate real-time communication. I was really impressed when I first used the Ajax concept and was happy that only the part of the Web page which needs to be refreshed is refreshed, and not the whole page. The XMLHttpRequest was God’s own gift! But even Ajax needs us to keep polling to check whether the server has sent any response. This is achieved by checking the value of the readyState property of the XMLHttpRequest object:

if (http.readyState == 4) { // 4 means the request is complete
	processData(http.responseText); // processData stands for your own function that processes the data
}

Long polling: This is also known as asynchronous polling. The browser sends a request to the server and the server keeps the request open for a long set period, hoping that the process is completed within the period. Thus the name long polling (but polling nevertheless).

Streaming: This is better than the above two techniques but faces possible complications when firewalls and proxies come into picture. The responses keep on getting built up and must be flushed periodically.

A step forward – beyond hacks

Enter HTML5 WebSocket to solve the above issues. It is a W3C API and IETF protocol, details of which can be found at http://dev.w3.org/html5/websockets/ or http://tools.ietf.org/html/draft-ietf-hybi-thewebsocketprotocol-03.

WebSocket is a full-duplex single socket. It creates a channel between the browser and the server where two-way communication can be performed. The client (on the browser) does not poll whether the response has come or not. The client and server can send data to each other through WebSocket freely. It shares the port with the existing HTTP content. It easily traverses through firewalls, routers, proxies, etc. This is also known as Push technology. The server just pushes the data to the client and does not mandate the client to keep polling for data.

There are 2 schemes provided by WebSocket:

  • WebSocket scheme  - ws://www.websocket.org/text
  • WebSocket secure scheme – wss://www.websocket.org/encrypted-text

WebSocket Architecture

WebSocket Architecture

The browser clients communicate with the WebSocket server using the WS scheme and the WebSocket server communicates with the backend servers using TCP. There are various WebSocket servers available and the best of the lot is undoubtedly the Kaazing WebSocket Gateway. An evaluation copy of the same can be downloaded from www.kaazing.com.

There are some backend servers which understand the WebSocket scheme and can accept the requests directly from the browser clients. ActiveMQ 5.4.0 is one such server which I have come across. The WebSocket architecture diagram would then be as follows:

WebSocket connection uses the same TCP connection as it upgrades from the HTTP protocol to the WebSocket protocol. Once it is upgraded to the WebSocket protocol, then the data can be sent back and forth between the client and the server in full-duplex mode.

WebSocket data is sent in frames, where each frame of data starts with a 0x00 byte and ends with a 0xFF byte. In between the start and end bytes, it contains the UTF-8 data, for example:

\x00Hello, WebSocket World\xff
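The framing described above is handled by the browser and the server, so application code never has to build frames itself. Purely to make the byte layout concrete, here is a sketch of that framing; encodeTextFrame and decodeTextFrame are hypothetical helper names, and the sketch covers only the 0x00 ... 0xFF text framing described in this article, which other revisions of the protocol may not share.

```javascript
// Wrap UTF-8 text in a 0x00 ... 0xFF frame, as described above.
function encodeTextFrame(text) {
	var payload = new TextEncoder().encode(text); // UTF-8 bytes
	var frame = new Uint8Array(payload.length + 2);
	frame[0] = 0x00;                // start-of-frame byte
	frame.set(payload, 1);          // the UTF-8 data
	frame[frame.length - 1] = 0xFF; // end-of-frame byte
	return frame;
}

// Reverse operation: check the delimiters and decode the UTF-8 payload.
function decodeTextFrame(frame) {
	if (frame.length < 2 || frame[0] !== 0x00 || frame[frame.length - 1] !== 0xFF) {
		throw new Error("Not a 0x00 ... 0xFF text frame");
	}
	return new TextDecoder("utf-8").decode(frame.subarray(1, frame.length - 1));
}
```

The two delimiter bytes are the only packaging added around the payload.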

There is no limit defined for the size of the data that can be sent, as long as the user agent can manage it. Since JavaScript does not allow more than 4GB of data, that is the practical limit.

With WebSocket, each data packet (frame) has only 2 bytes of packaging and so the overhead is a minimum, when compared to the header information in HTTP request and response headers. There is no latency in establishing the new TCP connections for each HTTP message. Hence there is significant reduction in network traffic and latency.
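To put rough numbers on this comparison, the sketch below uses the figures quoted in this article: up to about 2000 bytes of HTTP header overhead per message, against 2 bytes of framing per WebSocket message. The message count of 1000 is an arbitrary assumption, and the 2000-byte figure is an upper bound rather than a typical value.

```javascript
// Total overhead in bytes for a number of messages.
function overheadBytes(messageCount, overheadPerMessage) {
	return messageCount * overheadPerMessage;
}

var messages = 1000;                             // assumed workload
var httpOverhead = overheadBytes(messages, 2000); // HTTP headers (upper bound from the text)
var wsOverhead = overheadBytes(messages, 2);      // WebSocket framing

console.log(httpOverhead);              // 2000000
console.log(wsOverhead);                // 2000
console.log(httpOverhead / wsOverhead); // 1000
```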

There was a talk recently when W3C came out with a warning to go slow while embracing HTML5, but the good news is that all the important players such as Microsoft, Google, Mozilla and Apple are going ahead full steam supporting WebSocket in their browsers. The browsers which support the WebSocket as of now are:

  • Chrome 4.0 +
  • Firefox 4.0 Beta 1+
  • Safari 5.0 +
  • IE 9.0

There are two ways to check whether your browser supports WebSocket or not. The first way is to just go to www.websocket.org and on the top right side, you will get a message saying whether your browser supports WebSocket or not. The second way is to write a simple JavaScript code and check it yourself.

if (window.WebSocket)
{
alert("Yahoo !! Your browser does support WebSocket!");
}
else
{
alert("Sorry!! Your browser does not support WebSocket");
}

Now let us see how easy it is to use the WebSocket object.

var objWebSocket = new WebSocket(url);

The URL above is the address of the server to which you want to create the WebSocket connection, for example talk.google.com:5223 (the Google Talk server).

WebSocket programming is event based. Once the WebSocket is opened, we just wait for events and do not poll the server for data. There are 3 events associated with WebSockets and we need to code listeners for each one of them and associate the listeners with the events. The 3 events are open, message and close. The open event is fired when the WebSocket connection is opened successfully. The message event is fired when the server sends data. The close event is fired when the WebSocket connection is closed.

For example:

objWebSocket.onopen = function(evt)
{
alert("WebSocket connection opened successfully");
};
objWebSocket.onmessage = function(evt)
{
alert("Message : " + evt.data);
};
objWebSocket.onclose = function(evt)
{
alert("WebSocket connection closed");
};

Once the WebSocket connection is opened, the message event is fired whenever the server sends data to the client. If the client wants to send data to the server, it can do that easily as follows:

objWebSocket.send("Hello World");
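One practical detail: send() only works once the connection has opened, so a script that wants to send right after creating the socket has to wait for the open event. The sketch below wires the three events and queues early messages until then. openChannel is a hypothetical helper name, and the constructor is passed in as a parameter (in a browser it would be window.WebSocket) so that the wiring can also be exercised with a stand-in object.

```javascript
// Hypothetical helper: SocketCtor is assumed to behave like window.WebSocket.
function openChannel(SocketCtor, url, onMessage) {
	var socket = new SocketCtor(url);
	var isOpen = false;
	var pending = []; // messages requested while the connection is not open

	socket.onopen = function () {
		isOpen = true;
		while (pending.length > 0) {
			socket.send(pending.shift()); // flush queued messages in order
		}
	};
	socket.onmessage = function (evt) {
		onMessage(evt.data); // hand the server's data to the caller
	};
	socket.onclose = function () {
		isOpen = false;
	};

	return {
		send: function (text) {
			if (isOpen) {
				socket.send(text);
			} else {
				pending.push(text);
			}
		},
		close: function () {
			socket.close();
		}
	};
}
```

This sketch does not reconnect: messages passed to send() after the connection has closed simply stay in the queue.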

With WebSocket, you can communicate with WebSocket servers and via that to any back-end servers or any message brokers. You can extend any TCP-based protocol to the Web like XMPP, Jabber, Publish/Subscribe protocols like Stomp & AMQP, gaming protocols like Darkstar, etc.

Chirag Trivedi– Lead – HTML5 Group

18
Nov
This entry is part 1 of 4 in the series Securing Java Web Applications

Web applications are exposed to a number of threats all the time by their very nature of serving content to the public. There are intruders, hackers, impersonators, and eavesdroppers out there who try to wreck the contents you publish. This may harm your business in a serious way and therefore, it is important that you secure the contents of your Web application from such elements.

In this blog, I am going to explore some of the basic steps that can be taken to secure any Java-based Web application.

Placing the Files in the Right Directory

Java Web applications contain static HTML pages, JSP files, Servlet classes, Framework related classes (such as Action classes in case of Struts), Helper or Utility classes, image files, and any third-party jar files required by the application. Placing these files in the correct directories is the first step towards securing the contents of the application. The typical directory structure of any Java-based Web application deployed on a Tomcat Web server looks like this:

Java-based Web application

Only some of the parts of the application can be accessed directly using the URL. For example, in our case, the name of the application is TestApp which is also called a Webapp root directory. Anything that is placed directly inside the TestApp directory is accessible to the outside world, except the contents of the WEB-INF and META-INF directories. This stands true for any other directories we choose to create directly inside the TestApp directory to arrange our Web applications contents.

In our example, HTML files and JSP pages are placed directly inside the TestApp directory. Assume that a login.jsp file is also placed directly inside the TestApp directory.  This file can be accessed directly from outside the application using the URL http://www.mywebserver.com/TestApp/login.jsp. Further, assume that we choose to create a new directory called ‘auth’ inside the TestApp directory and place the login.jsp file in that directory. Will this new directory still be accessible from outside the application? Yes. In this case, the user just needs to point the browser to http://www.mywebserver.com/TestApp/auth/login.jsp.

This is not true for the resources placed inside the WEB-INF and META-INF directories. Anything inside these two directories cannot be accessed from outside the Web server. The Java Servlet Specification restricts any Web Server that supports Java-based application deployment from directly serving the contents of the WEB-INF and META-INF directories. If an attempt is made to access the resources inside these two directories, the Web Server must respond with a “404 NOT FOUND” error to the user.

In our example above, if we move login.jsp from TestApp to WEB-INF and try to access it through the URL http://www.mywebserver.com/TestApp/WEB-INF/login.jsp, we will get a “404 NOT FOUND” error from the Web Server. Try this with the META-INF directory as well and you will get the same error.

So, the basic rule of thumb with any Java-based Web application is, place the resources you want to restrict access to in either the WEB-INF directory or META-INF directory. In other words, never put your confidential and valuable resources outside these two directories!
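The rule can be expressed as a tiny check on the path within the application. The sketch below is written in JavaScript only to keep the examples in this archive in one language; isDirectlyAccessible is a hypothetical function, not how Tomcat implements the restriction, and the case-insensitive comparison is an assumption. A real container also normalizes the path (for example ".." segments) before applying such a rule.

```javascript
// Returns false for paths whose first segment is WEB-INF or META-INF.
function isDirectlyAccessible(pathWithinApp) {
	var segments = pathWithinApp.split("/").filter(function (s) {
		return s.length > 0;
	});
	if (segments.length === 0) {
		return true; // the application root itself
	}
	var top = segments[0].toUpperCase(); // assumption: compare case-insensitively
	return top !== "WEB-INF" && top !== "META-INF";
}

console.log(isDirectlyAccessible("/login.jsp"));         // true
console.log(isDirectlyAccessible("/auth/login.jsp"));    // true
console.log(isDirectlyAccessible("/WEB-INF/login.jsp")); // false
```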

Disabling the Directory Listing Functionality

A Web Server is simply a place to publish the content that you want other people to access. You use the standard directory structure given above to arrange your application resources in the required and desired manner. You have also restricted the access to confidential resources by placing them in the appropriate directories as explained above. What will happen if someone is able to navigate through these directories and hack in? A situation like this is known as a directory traversal attack.

The directory listing feature is enabled by default when you install the Tomcat Web Server for the first time. This feature is useful when you are developing and debugging the Web application to quickly traverse through the application directories and the contents inside. You may not like to use this feature on the production server though. The main reason behind directory listing restriction is to discourage users from navigating through proprietary information.

You can disable the directory listing feature in two ways:

  • Modify the global web.xml file to disable the feature for all Web applications.
  • Modify the web.xml file specific to the application to disable the feature for that single application.

The first way is to open and modify the $CATALINA_HOME/conf/web.xml file. Make sure it contains the following settings for org.apache.catalina.servlets.DefaultServlet.


<servlet>
    <servlet-name>default</servlet-name>
    <servlet-class>org.apache.catalina.servlets.DefaultServlet</servlet-class>
    <init-param>
        <param-name>debug</param-name>
        <param-value>0</param-value>
    </init-param>
    <init-param>
        <param-name>listings</param-name>
        <param-value>false</param-value>
    </init-param>
    <load-on-startup>1</load-on-startup>
</servlet>

The second way is to redefine the above servlet mappings, with a different servlet name, inside the web.xml file of a particular Web application. For example, if you want to disable the directory listing feature only for TestApp, then you need to edit the TestApp/WEB-INF/web.xml file. Make sure it contains the following servlet mapping for org.apache.catalina.servlets.DefaultServlet.


<servlet>
    <servlet-name>defaultServlet</servlet-name>
    <servlet-class>org.apache.catalina.servlets.DefaultServlet</servlet-class>
    <init-param>
        <param-name>debug</param-name>
        <param-value>0</param-value>
    </init-param>
    <init-param>
        <param-name>listings</param-name>
        <param-value>false</param-value>
    </init-param>
    <load-on-startup>1</load-on-startup>
</servlet>

<servlet-mapping>
    <servlet-name>defaultServlet</servlet-name>
    <url-pattern>/</url-pattern>
</servlet-mapping>

Note that we have changed the value of the <servlet-name> element in the second listing above. This is required to avoid a conflict for two different servlet mappings for the same servlet.

The two techniques I have described here are the basic ways to start with security implementations for Web applications. At the advanced level, several additional checks such as authentication, authorization, confidentiality, data integrity, encryption, Secure Socket Layer (SSL) protocol, and data conversion techniques are needed to strengthen the security of the application.

Hitesh Patel– Project Lead

26
Nov
This entry is part 2 of 4 in the series Securing Java Web Applications

In my last blog, Basics of Securing Java Web Applications, I talked about some basic measures that we can take to secure the content of Java-based Web applications. In this blog, I am extending the discussion on how to secure the application one step further by introducing the Authentication feature. Although Authentication and Authorization are features that usually go hand in hand, only the former has been discussed in this blog to keep it small and easier to understand. In my next post, I shall be discussing how to implement Authorization and how Authentication and Authorization work together.

As in the earlier blog, in this one too, I shall consider Apache Tomcat as the Web server on which the application is deployed.

Authentication and Authorization are two different techniques that are used together to restrict access to special features of a Web application. This restriction is applied on various parts of the application based on the organization roles, also called role-based access control. When deployed correctly, these two techniques together form the base of a secure Java-based Web application, which can be extended further by using other advanced techniques such as Confidentiality and Data Integrity.

I will also cover the definition of the terms ‘Authentication’ and ‘Security Realm’ as well as how to implement Authentication in a Web application here.

What is Authentication?

Authentication is an identity confirmation process, usually of the person trying to use the application. It is the way to ensure that the user is who he claims to be. In case of a Web application, it is the process of verifying the user’s credentials in some way.

Relation between the Apache Tomcat Server and the Security Realm

According to the Java Servlet Specification, a place where security information is stored is called a realm. The Tomcat Web server provides a $CATALINA_HOME/conf/tomcat-users.xml file that it loads at server startup, and it uses the information provided in that file to build a memory realm. This file can be modified as needed and used to test security implementations at development time, but on a production server, a JDBC-based realm or a JNDI-based realm is the better approach.

The tomcat-users.xml file contains user and role mapping information for a memory realm as described below.

<tomcat-users>
  <role rolename="tomcat"/>
  <role rolename="role1"/>
  <user username="tomcat" password="tomcat" roles="tomcat"/>
  <user username="both" password="tomcat" roles="tomcat,role1"/>
  <user username="role1" password="tomcat" roles="role1"/>
</tomcat-users>

To start with the Authentication and Authorization implementation for a Web application, first we need to modify the contents of this file as desired. Let us assume that we would like to drop role1 defined above and instead, introduce new roles called Manager and Guest. We shall also define two new users and assign them some roles. This can be done as described in the code fragment below:

<tomcat-users>
  <role rolename="tomcat"/>
  <role rolename="Manager"/>
  <role rolename="Guest"/>
  <user username="tomcat" password="tomcat" roles="tomcat"/>
  <user username="john" password="smith" roles="Manager, Guest"/>
  <user username="maria" password="" roles="Guest"/>
</tomcat-users>

Once we have defined the realm information, we can start with the Authentication and Authorization implementation for our application.
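Conceptually, a memory realm is just a lookup table from user names to passwords and roles. The sketch below mimics that idea in JavaScript, only to keep the examples in this archive in one language. It is not how Tomcat implements its realm; the realm object and the authenticate function are hypothetical, and the plain-text passwords simply mirror two of the users from the file above.

```javascript
// Illustrative stand-in for the realm data defined in tomcat-users.xml.
var realm = {
	tomcat: { password: "tomcat", roles: ["tomcat"] },
	john: { password: "smith", roles: ["Manager", "Guest"] }
};

// Returns a copy of the user's roles if the credentials match, otherwise null.
function authenticate(realmData, username, password) {
	var known = Object.prototype.hasOwnProperty.call(realmData, username);
	if (!known || realmData[username].password !== password) {
		return null;
	}
	return realmData[username].roles.slice();
}

console.log(authenticate(realm, "john", "smith")); // [ 'Manager', 'Guest' ]
console.log(authenticate(realm, "john", "wrong")); // null
```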

How to implement Authentication?

A common way to authenticate a user is to ask him to provide his username and password to access the restricted resource. Another way is to provide the user with a Public Key Security Certificate issued by a trusted security certificate provider (such as VeriSign®) and to validate that certificate every time the user tries to access the restricted resource.

The first approach of asking a user to provide his username and password can be achieved in several ways. All you need to do is specify the authentication method to be used in the <auth-method> element of the <login-config> element as described below:


<web-app …>
  …
  <login-config>
    <auth-method>BASIC</auth-method>
  </login-config>
</web-app>

There are four possible values for the authentication method: BASIC, DIGEST, CLIENT-CERT, and FORM. With BASIC and DIGEST, the browser’s default login dialog box collects the username and password, while CLIENT-CERT relies on a certificate installed on the client instead. Using the FORM method, you can provide a customized login form as desired.

  • BASIC transmits the username and password using the base64 encoding scheme, which is an encoding rather than encryption, so it is not as secure as DIGEST or CLIENT-CERT. All Web containers are required to support this authentication method.
  • DIGEST uses a more secure way to transmit login information, but Web containers are not required to support it.
  • CLIENT-CERT is the strongest one, because it uses a Public Key Certificate (PKC) to identify the user. Although it is the strongest, it is not widely used because a certificate must be distributed to all users in order to enable them to log in to the system.
  • FORM provides the flexibility to define our own login form, but it does not use any encryption scheme, so the credentials travel as plain text, which makes it the weakest of the four.

Defining FORM based authentication is a little different from the other <auth-method> definitions, as described below:

<web-app …>
  …
  <login-config>
    <auth-method>FORM</auth-method>
    <form-login-config>
      <form-login-page>/login.html</form-login-page>
      <form-error-page>/loginError.html</form-error-page>
    </form-login-config>
  </login-config>
</web-app>

When the FORM based authentication method is used, the browser will not display the standard login dialog box as it does for BASIC and DIGEST. The developer is responsible for creating a custom login form, subject to several constraints. Note that the above configuration specifies login.html as the login form. So, we need to create a login.html page in our application’s root directory. We also need to create a loginError.html page in the root directory, to which the user is redirected if authentication fails.

In the login.html page, we must use j_security_check as the form action, the name of the username field must be j_username, and the name of the password field must be j_password, as described below:

<form method="POST" action="j_security_check">
  User Name: <input type="text" name="j_username"> <br>
  Password: <input type="password" name="j_password">
  <input type="submit" value="Submit">
</form>

As explained above, the FORM based authentication method does not use any encryption. To secure the login information, we need to declare a <transport-guarantee> as part of an additional security constraint in the web.xml deployment descriptor. This is a separate topic altogether, which I shall discuss in a future blog.
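To make the point about BASIC concrete, here is a minimal sketch in plain Java (standard library only) showing that the base64 value sent by the browser can be turned back into the username and password without any key. The class name BasicAuthDemo and the sample credentials are made up for illustration; the header format "Basic " followed by base64 of username:password is the standard HTTP Basic scheme.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative only: base64 is a reversible encoding, not encryption.
class BasicAuthDemo {

    // Builds the Authorization header value used by BASIC authentication.
    static String encode(String username, String password) {
        String pair = username + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    // Recovers "username:password" from the header value; no secret is needed.
    static String decode(String headerValue) {
        String encoded = headerValue.substring("Basic ".length());
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String header = encode("hitesh", "secret"); // hypothetical credentials
        System.out.println(header);
        System.out.println(decode(header));
    }
}
```

Anyone who can read the request can run the equivalent of decode(), which is why both BASIC and FORM logins should only be sent over an encrypted connection.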

Once the authentication configuration is done in web.xml, the next step is to define the resources that we want to keep restricted. In my next blog, I shall cover this configuration along with the Authorization concept. Once the resources are constrained through the configuration provided in the web.xml file, whenever any user tries to access a restricted resource, the Web container will automatically display either the browser’s standard login dialog box or a login form if a FORM based authentication method is used. When a user submits his login credentials, they will be verified by the application server against the memory realm built by the Tomcat Web Server using the tomcat-users.xml file. In case of the FORM based authentication method, we can collect the username and password information on the server by extracting the values provided in the j_username and j_password fields. We can then use these values to authenticate the user against the JDBC-based realm, or the JNDI-based realm.

I have covered only the concept of Authentication in this post. In my next post, which I shall publish in a few days, I will cover Authorization to conclude the topic of Authentication and Authorization.

Hitesh Patel– Project Lead

02
Dec
This entry is part 3 of 4 in the series Securing Java Web Applications

This blog is a continuation of my previous one, Securing Java-based Web Applications: Authentication, in which I have explained the concept of Authentication with an implemented example of a Java-based Web application deployed using the Tomcat Web server.

In this blog, I will cover the concept of Authorization in the same way, starting with a definition and then moving on to an implementation example with code samples and their explanation. The code samples provided here are in line with those covered in my earlier post.

What is Authorization?

Authorization is the process of defining access policies for the resources we wish to protect. It is the way to ensure that the user has the rights to enter into the restricted area or perform a specific restricted function. In case of a Web application, it is the process of verifying the user’s rights based on his predefined role to access any restricted resource on the system.

In my previous post, I have covered the relationship between the Apache Tomcat Web Server and the Security Realm. I have also explained how to modify the $CATALINA_HOME/conf/tomcat-users.xml file to define users, roles and the mapping between the users and their roles. Additionally, I have also explained how to modify the web.xml file of an application to authenticate the user. But what happens after authentication?

Once the user has been authenticated, a Web container checks if the user is allowed to access the resource that he is requesting. It further checks if the user is allowed to perform the requested operation on that resource. This process is called Authorization. Remember however, that to enable authorization, it is necessary to enable authentication for an application by using the <login-config> setting as explained in my earlier post.

How to Implement Authorization?

It is the job of the deployer to decide which resources should be constrained and which users should be allowed to access those constrained resources. The deployer also defines the kind of operations that can be performed on those constrained resources by a user having a particular role.

The first step towards the implementation of authorization is to define roles, which can be done by modifying the $CATALINA_HOME/conf/tomcat-users.xml file as explained in my earlier post (which also explains how to define the Manager and Guest roles).

The second step is to define the security roles in web.xml so that Tomcat can map the security roles of an application to the roles defined in the memory realm. If we are using the same names for the roles in an application as defined in tomcat-users.xml, then we just need to map the roles as illustrated below:

<web-app …>
  …
  <security-role>
    <description>This is a Manager role</description>
    <role-name>Manager</role-name>
  </security-role>

  <security-role>
    <description>This is a Guest role</description>
    <role-name>Guest</role-name>
  </security-role>
</web-app>

If the role names used in the servlets differ from the roles defined for the application, then we have to map the servlet-specific (also known as user-defined) roles to the roles defined for the application. For example, assume that we have a servlet called DeleteUserServlet in our application and the developer of that servlet has used ‘Admin’ as a role in the request.isUserInRole("Admin") call. At the time of development, the developer assumed that ‘Admin’ would be the name of the role defined for the application, but the deployer has chosen a different name for the role: ‘Manager’. In this case, the deployer can map the user-defined role to the application-specific role as demonstrated in the following code snippet.

<web-app …>
  …
  <servlet>
    …
    <security-role-ref>
      <role-name>Admin</role-name>
      <role-link>Manager</role-link>
    </security-role-ref>
  </servlet>

  <security-role>
    <description>This is a Manager role which is equivalent to Admin role</description>
    <role-name>Manager</role-name>
  </security-role>

  <security-role>
    <description>This is a Guest role</description>
    <role-name>Guest</role-name>
  </security-role>
</web-app>
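The effect of the <security-role-ref> mapping can be sketched in plain Java. This is only a simplified model of the lookup, not the container's actual implementation: the class RoleLinkDemo and its fields are hypothetical, and a real servlet would simply call request.isUserInRole("Admin") and let the container resolve the link.

```java
import java.util.Map;
import java.util.Set;

// Simplified, hypothetical model of how a <security-role-ref> is resolved.
class RoleLinkDemo {
    // <role-name> to <role-link> pairs declared for the servlet
    private final Map<String, String> roleLinks;
    // Application roles held by the authenticated user in the realm
    private final Set<String> userRoles;

    RoleLinkDemo(Map<String, String> roleLinks, Set<String> userRoles) {
        this.roleLinks = roleLinks;
        this.userRoles = userRoles;
    }

    // Translates the name used in the servlet code into the application role,
    // falling back to the same name when no link is declared.
    boolean isUserInRole(String roleUsedInCode) {
        String applicationRole = roleLinks.getOrDefault(roleUsedInCode, roleUsedInCode);
        return userRoles.contains(applicationRole);
    }

    public static void main(String[] args) {
        RoleLinkDemo demo = new RoleLinkDemo(Map.of("Admin", "Manager"), Set.of("Manager"));
        System.out.println(demo.isUserInRole("Admin")); // resolved through the link to Manager
        System.out.println(demo.isUserInRole("Guest")); // user does not hold the Guest role
    }
}
```

With the link in place, the servlet code keeps asking for ‘Admin’ and does not need to be recompiled when the deployer renames the application role.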

The last step is to define the resource and HTTP method constraints for an application in the web.xml file. Assume that we are developing a shopping cart application and we have DiscardBidServlet, DiscardItemServlet, and AddItemServlet. We will be mapping these servlets in the web.xml file as below:

<web-app …>
  …
  <servlet>
    <servlet-name>DiscardBidServlet</servlet-name>
    <servlet-class>app.admin.DiscardBidServlet</servlet-class>
  </servlet>

  <servlet>
    <servlet-name>DiscardItemServlet</servlet-name>
    <servlet-class>app.admin.DiscardItemServlet</servlet-class>
  </servlet>

  <servlet>
    <servlet-name>AddItemServlet</servlet-name>
    <servlet-class>app.common.AddItemServlet</servlet-class>
  </servlet>

  <servlet-mapping>
    <servlet-name>DiscardBidServlet</servlet-name>
    <url-pattern>/ShoppingCartApp/manage/deleteBid</url-pattern>
  </servlet-mapping>

  <servlet-mapping>
    <servlet-name>DiscardItemServlet</servlet-name>
    <url-pattern>/ShoppingCartApp/manage/deleteItem</url-pattern>
  </servlet-mapping>

  <servlet-mapping>
    <servlet-name>AddItemServlet</servlet-name>
    <url-pattern>/ShoppingCartApp/addItem</url-pattern>
  </servlet-mapping>
</web-app>

Now, we would like to apply the constraint that DiscardBidServlet and DiscardItemServlet should only be accessible by a user with the ‘Manager’ role and AddItemServlet should be executed only by a user with a ‘Guest’ role. This can be achieved as described below.

<web-app …>
  …
  <security-constraint>
    <web-resource-collection>
      <web-resource-name>ManagerConstraints</web-resource-name>
      <url-pattern>/ShoppingCartApp/manage/*</url-pattern>
      <http-method>GET</http-method>
      <http-method>POST</http-method>
    </web-resource-collection>

    <auth-constraint>
      <role-name>Manager</role-name>
    </auth-constraint>
  </security-constraint>

  <security-constraint>
    <web-resource-collection>
      <web-resource-name>GuestConstraints</web-resource-name>
      <url-pattern>/ShoppingCartApp/*</url-pattern>
    </web-resource-collection>

    <auth-constraint>
      <role-name>Guest</role-name>
    </auth-constraint>
  </security-constraint>
</web-app>

In the above code snippet, we have declared the URL pattern /ShoppingCartApp/manage/* as a constrained resource. We have further narrowed the constraint to the GET and POST HTTP methods. This means that whenever the Web container receives a GET or POST request for a URL that matches the above pattern, it will first check whether the user is authenticated.

Authentication for an application is explained in my previous post. The container will then check whether the user who requested this URL has the ‘Manager’ role by looking up the user-role mapping in the memory realm built from the tomcat-users.xml file. The servlet will be allowed to serve the request only when both checks succeed.

In the above configuration, the URL pattern and HTTP methods together define which resource requests are constrained to be accessed only by the roles that are defined in the <auth-constraint> entry. Since only the ‘Manager’ role is specified in the first <security-constraint> entry, any user who does not have the ‘Manager’ role will not be allowed to execute GET and POST requests on the URL pattern /ShoppingCartApp/manage/*.

In the second <security-constraint> entry, we have constrained the URL pattern /ShoppingCartApp/* for all HTTP methods by leaving out the <http-method> element, because the deployment descriptor does not define a ‘*’ wildcard for HTTP methods. This means that only users with the ‘Guest’ role will be allowed to access the URLs matching /ShoppingCartApp/*, whatever the HTTP method. Remember however, that the first <security-constraint> entry still prevents ‘Guest’ users from accessing the ‘Manager’ specific URLs.

The <url-pattern> entry in <web-resource-collection> is mandatory; however, it is possible to specify more than one <url-pattern> in a single <web-resource-collection>. This way, we can constrain multiple URLs for the same set of roles for the same set of HTTP methods.

The <http-method> entry is optional. If omitted, all the HTTP methods will be constrained, which means that only the roles specified in the <auth-constraint> entry can access any of the methods. If used, only the specified HTTP methods will be constrained, and not the rest.

More than one <web-resource-collection> can be specified in the same <security-constraint>. This way, we can constrain multiple URLs for multiple HTTP methods for the same set of roles.

The <role-name> element within the <auth-constraint> element is optional. If it exists with specific role names, only those roles will be allowed to access the constrained resources. If the wildcard character ‘*’ is used for <role-name>, then users in any of the roles defined in the deployment descriptor are allowed to access the constrained resources. If the <role-name> entry is omitted from the <auth-constraint> element, none of the users will be able to access the constrained resources.

The <auth-constraint> element within the <security-constraint> element is optional. If it exists, the container must perform authentication and authorization for the constrained resources. If omitted, the container must allow unauthenticated access to the constrained resources.
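The rules above can be summarized in a small Java sketch. It is a deliberately simplified model under stated assumptions, not how Tomcat is implemented: it only understands path-prefix patterns ending in /*, it treats a constraint without methods as covering every HTTP method, and it ignores the ‘*’ role and <user-data-constraint>. The class ConstraintDemo and its members are hypothetical names chosen for this illustration.

```java
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of <security-constraint> evaluation.
class ConstraintDemo {

    static final class Constraint {
        final String urlPattern;   // must end with "/*" in this sketch
        final Set<String> methods; // empty set stands for "no <http-method> given"
        final Set<String> roles;   // role names from <auth-constraint>

        Constraint(String urlPattern, Set<String> methods, Set<String> roles) {
            this.urlPattern = urlPattern;
            this.methods = methods;
            this.roles = roles;
        }

        boolean matchesPath(String path) {
            String prefix = urlPattern.substring(0, urlPattern.length() - 2);
            return path.equals(prefix) || path.startsWith(prefix + "/");
        }

        boolean coversMethod(String method) {
            return methods.isEmpty() || methods.contains(method);
        }
    }

    static boolean isAllowed(List<Constraint> constraints, String path,
                             String method, Set<String> userRoles) {
        // Step 1: pick the most specific (longest) pattern that matches the path.
        String best = null;
        for (Constraint c : constraints) {
            if (c.matchesPath(path)
                    && (best == null || c.urlPattern.length() > best.length())) {
                best = c.urlPattern;
            }
        }
        if (best == null) {
            return true; // no constraint applies, so the resource is open
        }
        // Step 2: check the roles of every constraint on that pattern
        // that covers the requested HTTP method.
        boolean constrained = false;
        for (Constraint c : constraints) {
            if (c.urlPattern.equals(best) && c.coversMethod(method)) {
                constrained = true;
                for (String role : c.roles) {
                    if (userRoles.contains(role)) {
                        return true;
                    }
                }
            }
        }
        return !constrained;
    }

    // The two constraints from the shopping cart example.
    static List<Constraint> sampleConstraints() {
        return List.of(
            new Constraint("/ShoppingCartApp/manage/*", Set.of("GET", "POST"), Set.of("Manager")),
            new Constraint("/ShoppingCartApp/*", Set.of(), Set.of("Guest")));
    }

    public static void main(String[] args) {
        List<Constraint> cs = sampleConstraints();
        System.out.println(isAllowed(cs, "/ShoppingCartApp/manage/deleteBid", "GET", Set.of("Manager")));
        System.out.println(isAllowed(cs, "/ShoppingCartApp/manage/deleteBid", "GET", Set.of("Guest")));
        System.out.println(isAllowed(cs, "/ShoppingCartApp/addItem", "POST", Set.of("Guest")));
    }
}
```

In this model a ‘Manager’ may issue GET requests under /ShoppingCartApp/manage/, a ‘Guest’ may not, and a ‘Guest’ may reach /ShoppingCartApp/addItem with any method.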

This concludes the implementation of Authentication and Authorization for a Java-based Web application. In my upcoming blog, I shall be discussing how to use servlet filters to deal with concerns (also known as Aspects) such as Logging and Security in an application.

Hitesh Patel– Project Lead

17
Dec
This entry is part 4 of 4 in the series Securing Java Web Applications

Introduction

There are often requirements for a Web application to provide the functionalities that are either for an administrative purpose, for a special group of users, or simply to improve the overall response time for a specific type of response for the application. For a Java-based Web application, we have the facility of using filters and wrappers to serve these functionalities. Filters and wrappers are powerful features provided by servlet API to implement cross-cutting features such as logging, auditing, keeping track of user activities, zipping a large file before sending it to a user, or creating a different response altogether.

In this blog article, I am going to discuss the definition of a filter and how we can use the filter API in an application.

What are Filters?

Filters are Java components available in the servlet API that are used to intercept a request before it reaches a servlet and the response generated by that servlet. The functionality and implementation of a filter are very similar to that of a servlet.

How to implement a Filter?

There is a single interface called Filter available in the servlet API to implement a filter. This interface declares the lifecycle methods for a filter. There are three lifecycle methods called init(), doFilter() and destroy() that we need to implement in a filter class as demonstrated in the following code snippet.


package test;

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class TestFilter implements Filter {
  private FilterConfig config;

  public void init(FilterConfig config) throws ServletException {
    this.config = config;
  }

  public void doFilter(ServletRequest request,
                       ServletResponse response,
                       FilterChain chain)
                          throws ServletException, IOException {
    HttpServletRequest req = (HttpServletRequest) request;
    …
    // Call the next filter in the chain
    chain.doFilter(request, response);

    // Process the response if required
    …
  }

  public void destroy() {
  }
}

As we can see, the implementation of a filter is very similar to that of a servlet, with a minor difference in how a filter is executed. We can have multiple filters associated with a given URL pattern or servlet, and the container chooses which filters to execute, and in which order, based on the configuration provided in the deployment descriptor. The container follows certain rules to decide which filter will be executed next.

The call to chain.doFilter() in the code snippet above is an instruction to the container that this filter has finished its work and the request is ready to be passed on to the next filter in the chain. If there is no other filter remaining in the chain, then the container will pass on the request to the actual servlet. Similarly, when the servlet finishes with the response, the container will pass on the response to the last filter called and the execution begins from the statement after the chain.doFilter() call.

We need to register and map filters to a URL pattern or to a servlet in the deployment descriptor as illustrated in the following code snippet. We will map TestFilter declared above, together with two more filters, Filter1 and Filter2, to the ‘*.do’ URL pattern and to FilterTestServlet to understand how multiple filters are mapped and executed.

<filter>
  <filter-name>TestFilter</filter-name>
  <filter-class>test.TestFilter</filter-class>
</filter>

<filter>
  <filter-name>Filter1</filter-name>
  <filter-class>test.Filter1</filter-class>
</filter>

<filter>
  <filter-name>Filter2</filter-name>
  <filter-class>test.Filter2</filter-class>
</filter>

<filter-mapping>
  <filter-name>TestFilter</filter-name>
  <url-pattern>*.do</url-pattern>
</filter-mapping>

<filter-mapping>
  <filter-name>Filter2</filter-name>
  <servlet-name>FilterTestServlet</servlet-name>
</filter-mapping>

<filter-mapping>
  <filter-name>Filter1</filter-name>
  <url-pattern>*.do</url-pattern>
</filter-mapping>

Here, we have configured TestFilter and Filter1 for the URL pattern ‘*.do’ and Filter2 specifically for FilterTestServlet. When more than one filter is mapped to a single URL pattern or a servlet, the container executes all URL related filters first in the order specified in the deployment descriptor, followed by all the filters mapped to a specific servlet in the order specified.

In the example above, if the request is for FilterTestServlet and its URL ends with the ‘.do’ extension, then the container executes TestFilter first and calls its doFilter() method. When the execution reaches the chain.doFilter() call in TestFilter, the container executes Filter1, because it too matches the ‘.do’ extension. Similarly, when the execution reaches the chain.doFilter() call in Filter1, the container executes Filter2, because it is declared as a filter for FilterTestServlet. When the container encounters the chain.doFilter() call in Filter2 and identifies that there are no more filters in the chain to be executed, the request is passed on to the actual servlet, FilterTestServlet. When FilterTestServlet finishes, the container will resume the filters in LIFO order. In our example, the container will resume Filter2, Filter1, and TestFilter in that order. Remember, this time the execution begins at the statement that follows the chain.doFilter() call in each filter.
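The ordering described above can be reproduced with a small, self-contained Java sketch. It does not use the servlet API: SimpleFilter and SimpleChain are hypothetical stand-ins for Filter and FilterChain, reduced to what is needed to record the order of calls.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone model of filter chaining; not the javax.servlet classes.
class FilterChainDemo {

    interface SimpleFilter {
        void doFilter(String request, SimpleChain chain, List<String> log);
    }

    static final class SimpleChain {
        private final List<SimpleFilter> filters;
        private final String servletName;
        private int next = 0;

        SimpleChain(List<SimpleFilter> filters, String servletName) {
            this.filters = filters;
            this.servletName = servletName;
        }

        // Invokes the next filter, or the servlet once the filters are exhausted.
        void doFilter(String request, List<String> log) {
            if (next < filters.size()) {
                SimpleFilter filter = filters.get(next++);
                filter.doFilter(request, this, log);
            } else {
                log.add(servletName + " service");
            }
        }
    }

    // A filter that records when it runs before and after the rest of the chain.
    static SimpleFilter named(String name) {
        return (request, chain, log) -> {
            log.add(name + " before");
            chain.doFilter(request, log);
            log.add(name + " after");
        };
    }

    // URL-mapped filters first (TestFilter, Filter1), then the servlet-mapped Filter2.
    static List<String> run() {
        List<SimpleFilter> filters =
                List.of(named("TestFilter"), named("Filter1"), named("Filter2"));
        List<String> log = new ArrayList<>();
        new SimpleChain(filters, "FilterTestServlet").doFilter("/test.do", log);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Running it lists the three "before" entries in mapping order, then the servlet, then the "after" entries in reverse, which is the LIFO behaviour described in the text.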

Since filters are configured for a URL or a resource through a deployment descriptor, it is very easy to change the execution order, or to add new filters and remove the existing ones without modifying the actual servlet code. Using filters, you can easily add common functionalities such as logging, auditing, security, user tracking, etc. across the application without touching the servlet code.

Conclusion

There is more to the filter API to modify the request and response objects before the actual request is passed onto the servlet or the actual response is passed onto the client. We have wrapper classes called ServletRequestWrapper, HttpServletRequestWrapper, ServletResponseWrapper, and HttpServletResponseWrapper in servlet APIs using which we can provide custom request and response objects. For detailed explanation of filters and wrappers, I suggest reading The Essentials of Filters provided by Oracle.

Hitesh Patel– Project Lead

18
Aug

While working on a project, I came across a requirement to write a query that retrieves the Nth maximum (salary) from the records in a database. It sounded simple before I started writing the query, but as I progressed, I realized the need for a more reliable solution than the ones commonly used.

Many of us have run into confusion when retrieving the nth maximum record from a database, and many PL-SQL developers and DBAs end up running large, complicated queries to get the desired output. I have encountered several critical issues during PL-SQL programming on the projects I work on, and I am keen to share this one because it illustrates the ever pertinent problem of finding the nth maximum record in a database.

Nth maximum record: this means retrieving the second, third or tenth maximum record from a given table. In my case, I had to retrieve the nth maximum salary from a persons table, i.e. the employees of a company.
There may be several ways to retrieve this output, but a few developers include ‘rownum’ in their queries, which is probably not going to give the correct result.

Let us see an example.

A query normally written by most developers:
select * from (select persons.*, rownum as r1 from persons order by salary desc) where r1=2
The result of the above query depends entirely on the order in which the records were inserted into the table.
For example, the structure of the table persons is:
CREATE TABLE PERSONS
(
P_ID INTEGER NOT NULL,
LASTNAME VARCHAR2 (255 BYTE) NOT NULL,
FIRSTNAME VARCHAR2 (255 BYTE) NOT NULL,
ADDRESS VARCHAR2 (255 BYTE),
CITY VARCHAR2 (255 BYTE),
SALARY NUMBER
)

Following are the insertion statements for the persons table:

INSERT INTO PERSONS ( P_ID, LASTNAME, FIRSTNAME, ADDRESS, CITY, SALARY ) VALUES ( 3, 'first', 'ankush5', 'city1', NULL, 500);

INSERT INTO PERSONS ( P_ID, LASTNAME, FIRSTNAME, ADDRESS, CITY, SALARY ) VALUES ( 1, 'second', 'ankush8', 'city2', NULL, 80);

INSERT INTO PERSONS ( P_ID, LASTNAME, FIRSTNAME, ADDRESS, CITY, SALARY ) VALUES ( 4, 'third', 'ankush', 'city3', NULL, 90);

INSERT INTO PERSONS ( P_ID, LASTNAME, FIRSTNAME, ADDRESS, CITY, SALARY ) VALUES ( 2, 'fourth', 'eankus', 'city4', NULL, 600);
COMMIT;

If you look closely at the above statements, the salary values are not inserted in any particular order; in short, they are not sorted. The person with a salary of 500 is inserted first, 80 second, 90 third and 600 last.

Now suppose a developer executes the query that uses rownum to retrieve the second highest salary:
select * from (select persons.*, rownum as r1 from persons order by salary desc) where r1=2 order by salary desc
The output would be 80, because rownum for that row is 2. Oracle assigns rownum as the rows are fetched, before the ORDER BY is applied, so this record typically gets rownum = 2 simply because it was inserted second. The logic of using rownum in this way to retrieve the second highest salary is therefore unjustifiable.

After many permutations and combinations, I finally arrived at what is, according to me, an accurate way of writing the query: using dense_rank.

A query written using dense_rank:

SELECT * FROM (SELECT persons.*, DENSE_RANK() OVER (ORDER BY salary desc) s_dense_rank FROM persons ) WHERE s_dense_rank = 2

If I execute this query to retrieve the second highest salary, I get my desired output, i.e. 500, even though it was inserted first in the table.

In Oracle/PL-SQL, the dense_rank function returns the rank of a row in a group of rows. The dense_rank function can be used two ways – as an Aggregate function or as an Analytic function. In my problem, I have used dense_rank as an Analytic function.
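For readers who want to check the logic outside the database, the same idea can be sketched in plain Java. This is only an illustration of what a dense rank over descending salaries selects, not a replacement for the query: the class NthSalaryDemo and the method nthHighest are hypothetical, and the salary values are the ones inserted above.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative model of "salary whose dense rank is n" over a list of values.
class NthSalaryDemo {

    // Ties share a rank, so duplicates are removed before counting positions.
    // n is 1-based; an empty Optional means fewer than n distinct salaries exist.
    static Optional<Integer> nthHighest(List<Integer> salaries, int n) {
        return salaries.stream()
                .distinct()
                .sorted(Comparator.reverseOrder())
                .skip(n - 1)
                .findFirst();
    }

    public static void main(String[] args) {
        // Same insertion order as the persons table: 500, 80, 90, 600
        List<Integer> salaries = List.of(500, 80, 90, 600);
        System.out.println(nthHighest(salaries, 2)); // Optional[500]
    }
}
```

The result does not depend on the order of the input list, which is exactly the property the rownum query lacks.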

This is one of the ways I could arrive at. Please leave your comments, and if you have faced a similar situation or know of related solutions, please share them so that we can have a fruitful interaction.

Sachin Padha– Software Developer ( PL-SQL Developer )
