Understanding Intention

Using Content, Context, and The Crowd To Build Better Search Applications


Table of Contents

Introduction: So, You’ve Got All This Data

Chapter 1: Search is Easy

Chapter 2: Enterprise Search Isn’t So Easy


INTRODUCTION

So, You’ve Got All This Data

You’ve probably heard the statistic: By the year 2020, gigabytes will outnumber humans 5,200:1. Usually accompanying this breathless warning is a Friedmanian mixed metaphor representing the infinite: grains of sand, haystacks, stars in the universe, craft beers, Brett Favre interceptions, you name it. In this age of information, simply going about our daily routines means creating and collecting a staggering amount of data.

A typical American office worker produces 1.8 million megabytes of data each year, or about 5,000 megabytes a day. (Source: Technology Review)


In just a few years, we’ve broadly accepted the notion that accumulating and storing all this data (instead of deleting it) is tremendously valuable. But with so much of it, we’re now worried we won’t be able to capture all of its value.

[Figure: The data boom looms. Data center storage and consumer cloud storage, in exabytes, 2013 to 2018. Source: Cisco]


From Data Intrigue to Data Fatigue

“Where are we going to put all this stuff?”

An EMC survey of 800 IT professionals found that 4 of their top 5 anxieties involved storing, accessing, and securing their data. What’s clear is that the stark reality of commodity storage is quickly replacing the fantastical utopian future of big data.

“How do I use all this stuff to make informed, strategic, and big-picture decisions?”

Scaling and storage are solved problems. Now what? The fifth concern voiced in the EMC survey was capturing the supposed game-changing value of data. In other words, I can see all the Brett Favre interceptions, but there are so many that they make my eyes glaze over. How can I pick up on patterns that’ll tell me when and why those turnovers are happening?

The promise of big data isn’t gone. It’s just changed from “Where do we put all of this stuff?” to “How do we make all this data accessible to users in a meaningful way?”

2013/2014 storage anxieties identified by IT professionals (Source: EMC)

ANXIETY: PERCENTAGE
• Managing storage growth: 79%
• Designing, deploying, and managing backup, recovery, and archive solutions: 43%
• Making informed strategic/big-picture decisions: 39%
• Designing, deploying, and managing disaster recovery solutions: 38%
• Lack of skilled storage professionals: 37%
• Designing, deploying, and managing storage in a cloud computing environment: 29%
• Designing, deploying, and managing storage in a virtualized server environment: 18%
• Lack of skilled cloud technology professionals: 15%


CHAPTER 1

Search is Easy

Search: From box to entry point


In 2014, Google alone handled more than 1 trillion (1,000,000,000,000) searches. Factor in other Web behemoths like Amazon, Facebook, YouTube, and Twitter, and it becomes clear: Search technology touches every part of our daily lives, from how we shop, eat, and date, to how we consume, communicate, and celebrate.

The success of the consumer Web proves that search is the entry point to extracting value and meaning from a nearly infinite (and infinitely growing) amount of data. Looking for the search box as our starting point is second nature. We have faith in its ability to discover what we want to do or know and who we want to talk to.

Source: Google


As consumers, we navigate the data deluge aboard the USS Search. Given the obvious benefits, elevated stakes, and advantage of private investment, you'd think it'd be even easier for enterprises.

And you would be wrong.


CHAPTER 2

Enterprise Search Isn’t So Easy

Shifting from storage to value means focusing on ease of use


Businesses are People, Too

Unlike consumer search, which has become a seamless part of our everyday lives, the enterprise side might as well still be running Windows 95. Imagine if Amazon, Google, or Facebook treated every user the same, regardless of who they are, where they are, what they’re searching for, and what they’ve clicked. They don’t: consumer apps tailor what they show to each user.

Your users expect that same sophistication in their enterprise apps.


Easy Isn’t Enough

It has to be smarter.


People Want Action

Restaurants where you’ve made OpenTable reservations in the past are highlighted in your Google Maps.

Premieres of your favorite TV shows pop up on your calendar.

While browsing Spotify, a discounted ticket offer shows up for a movie you searched for the day before.

A fitness wearable detects your blood pressure on the rise and schedules a gym visit on your calendar.

Your friend recommended the lasagna at a particular restaurant.


Relevancy is King

The key to relevancy is understanding your users’ intentions. Relevancy in a search app is made up of 3 main parts.

• content: Data and documents
• context: Individual history and behavior
• crowd: Similar users


Content

Content refers to documents and data: all of the stuff you want to index, search, and retrieve. Content comes in 2 main formats:

  1. Structured data: Spreadsheets, databases, lists, network logs, and anything else that looks like a table

  2. Unstructured data: PDFs, documents, presentations, scanned documents, instant messages, emails, webpages, audio, video, and anything that doesn’t fit neatly into a tabular format

Content comes from many sources, including:

• network drives
• news services
• intranet
• insurance claims
• wikis
• support tickets
• cloud storage
• on-premise servers
• vendors, partners
• hard drives
• mobile devices
• email servers
• banking activity
• call detail records
• stock tickers
• network logs
• social media streams
• medical records


Ability to access data is enterprise content’s primary challenge. To make sense of the data, it needs to be indexed, linked with your search apps, and made accessible to the correct users. This is what relevancy means for enterprise search. Content can be used to drive relevancy when access controls are enforced and rich metadata is present — such as content classifications, author, and subject fields. In an ideal world, we’d be able to tap into our data in real time from any device.
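To make that concrete, here is a minimal sketch of content-driven relevancy: access controls are enforced first, and matches in metadata fields count for more than matches in the body. It is an illustration under stated assumptions, not any product's API. The `Document` fields, the group names, and the `metadata_boost` weight are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    # Every field name here is hypothetical, chosen for illustration.
    title: str
    text: str
    allowed_groups: set                      # groups permitted to see the document
    metadata: dict = field(default_factory=dict)  # e.g. classification, author, subject


def search(docs, query, user_groups, metadata_boost=2.0):
    """Return titles of documents the user may see, best match first."""
    terms = query.lower().split()
    scored = []
    for doc in docs:
        # Enforce access controls before scoring anything.
        if not (doc.allowed_groups & user_groups):
            continue
        body_hits = sum(doc.text.lower().count(t) for t in terms)
        # Matches in metadata fields are weighted more heavily than body matches.
        meta_text = " ".join(doc.metadata.values()).lower()
        meta_hits = sum(meta_text.count(t) for t in terms)
        score = body_hits + metadata_boost * meta_hits
        if score > 0:
            scored.append((score, doc.title))
    return [title for _, title in sorted(scored, key=lambda pair: -pair[0])]
```

A document with a matching subject field outranks one that only mentions the term in passing, and a document the user has no rights to never appears at all, however well it matches.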


Context

Overall, the content part of the puzzle is solved. Technologies like Hadoop, Solr, and NoSQL have made accessing, indexing, and scaling data easier and more cost-effective than ever. The challenge, then, becomes zooming in. Not only do you need to be able to see the information within the documents and data itself, you need to understand how different files relate to each other and, further, how they relate to you (and other users like you). The second element of the relevancy trio is context.

Along with analyzing how all the bits of data are interrelated, there’s the question of relatability. When a search app knows more about you, it can create a relevant search experience that helps you get personal, actionable search results on a consistent basis.

Search apps have solved that problem with signal processing. A signal is any bit of information that tells the app more about who you are. Signals can include your job title, business unit, direct reports, location, device, and upcoming meetings or events, as well as past actions within the search app like search history, clickstream, purchasing behavior, and more.
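One way to picture signal processing is as a re-ranking step: start from a plain text-match score, then nudge each result up when it lines up with what the app knows about you. The sketch below assumes a made-up result shape of `(doc_id, base_score, attributes)` and invented weights. The signal names are borrowed from the list above, but nothing here reflects a specific product.

```python
def rerank_with_signals(results, signals, weights=None):
    """Re-rank (doc_id, base_score, attributes) tuples using user signals.

    The signal keys and default weights are illustrative assumptions.
    """
    weights = weights or {"business_unit": 1.0, "location": 0.5, "clicked": 2.0}
    reranked = []
    for doc_id, base_score, attrs in results:
        score = base_score
        # Boost documents that share the user's business unit or location.
        for key in ("business_unit", "location"):
            if key in signals and attrs.get(key) == signals[key]:
                score += weights[key]
        # Boost documents the user has clicked before.
        if doc_id in signals.get("clicked", set()):
            score += weights["clicked"]
        reranked.append((score, doc_id))
    reranked.sort(key=lambda pair: -pair[0])
    return [doc_id for _, doc_id in reranked]
```

With no signals at all, the function falls back to the original ordering, which is the "treat every user the same" behavior that context is meant to replace.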


Crowd

You’re nearly good to go: All your data is in one place, available at any time, and your app is personalizing search results to provide deeper meaning and context to your requests. But you’re not in uncharted waters. No doubt, others before you (other users like you) have searched for similar things and navigated to what they needed. So, what about everybody else? Where did their searching take them? The final point of our relevancy triad is the crowd. When a search app uses the crowd, it goes beyond documents and data, past your specific user profile and relationship, and examines how other users are interacting with the data and information.

A search app knows the behavioral information of thousands, sometimes millions, of other users. By keeping track of every user, search apps can bubble up what you will find important and relevant and what other users like you will want, too. The tech matches what it knows about your office, role, and demographic against the same attributes in other users to make intelligent judgments about what will help you the most.
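Here is a rough sketch of how the crowd might be put to work: find users who share your role or office, then surface the documents they clicked that you haven't seen yet. The profile fields, the click log shape, and the similarity measure (a simple count of shared attributes) are assumptions made for illustration. Real systems use far richer models.

```python
from collections import Counter


def crowd_recommendations(user, click_log, profiles, top_n=3):
    """Suggest documents clicked by users who share the user's role or office.

    click_log: list of (user, doc_id) pairs (hypothetical log shape).
    profiles:  mapping of user -> {"role": ..., "office": ...}.
    """
    me = profiles[user]
    already_seen = {doc for who, doc in click_log if who == user}
    votes = Counter()
    for who, doc in click_log:
        if who == user or doc in already_seen:
            continue
        other = profiles[who]
        # Similarity is the number of attributes the two users share.
        similarity = sum(
            me.get(key) is not None and me.get(key) == other.get(key)
            for key in ("role", "office")
        )
        if similarity:
            votes[doc] += similarity
    return [doc for doc, _ in votes.most_common(top_n)]
```

Clicks from closer matches (same role and same office) carry more weight than clicks from users who share only one attribute, and users with nothing in common contribute nothing.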


Content, Context, Crowd

This holy trinity (content, context, and crowd) is already a huge part of the success of consumer-facing search apps. Let’s take a look at how these 3 ingredients can impact relevancy and importance on the enterprise side.


EXAMPLE 1

Ecommerce

• content: Brand names, product descriptions, store availability, product category
• context: Clickstream, user demographic, loyalty program participation, in-store vs. online behavior, wish list
• crowd: Cart and purchase patterns, similar customers’ purchases


EXAMPLE 2

Security + Compliance

• content: Audit logs, change requests and work tickets, claims logs, access logs, network logs, emails, instant messages, email attachments
• context: Location, time, sensitivity of content, attempted action, success or failure, associated agent role, business unit
• crowd: Behavioral patterns, user in role/job function (e.g., lots of failed logins to a payroll system from a marketing user)
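The failed-login example can be sketched as a comparison between one user and the crowd in the same role. The code below is a minimal illustration with assumed inputs: the event tuple shape, the `ratio` threshold, and the `min_failures` floor are all hypothetical, and a production system would use a more careful statistical baseline.

```python
from collections import Counter
from statistics import mean


def flag_unusual_failures(events, roles, system, ratio=3.0, min_failures=3):
    """Flag users whose failed logins to `system` far exceed their role's norm.

    events: iterable of (user, system, succeeded) tuples (hypothetical log shape).
    roles:  mapping of user -> role.
    """
    failures = Counter(u for u, s, ok in events if s == system and not ok)
    flagged = []
    for user, count in failures.items():
        peers = [u for u, r in roles.items() if r == roles.get(user) and u != user]
        # Peers with no failures count as zero when computing the baseline.
        baseline = mean(failures.get(p, 0) for p in peers) if peers else 0.0
        if count >= min_failures and count > ratio * baseline:
            flagged.append(user)
    return sorted(flagged)
```

A payroll clerk who mistypes a password as often as other payroll clerks is not flagged. A marketing user with many failures against the same system, when other marketing users have almost none, is.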


EXAMPLE 3

Enterprise Search

• content: MS Office documents; files in the cloud, intranet, and wiki
• context: Job role, business unit, security access
• crowd: Other business unit users, regionally popular documents


EXAMPLE 4

Fraud

• content: Transactions, reported travel plans, verifications of previous transactions, insurance claims
• context: Location, transaction size, in-person vs. online, seller, items purchased, IP address, country of origin
• crowd: People with similar customer history; others buying similar items; others in the same location; patterns of fraudulent behavior; users in similar divisions, regions, or business units


EXAMPLE 5

Customer-Facing Data/Apps

• content: User behavior, purchase history, social signals, referrals, search history, viewing history (Web/app logs), click path, search requests, geo pings
• context: Items viewed, items abandoned, customer location, time of day, content related to things around your current location; free vs. paid customer
• crowd: People who looked at the same thing, people who bought the same thing, other similar items, similar people (who have bought or looked at the same things), others nearby, aggregated behavior patterns of users


This has been a Lucidworks production. Lucidworks builds enterprise search solutions for some of the world's largest brands. Fusion, Lucidworks' advanced search platform, provides the enterprise-grade capabilities needed to design, develop, and deploy intelligent search apps at any scale. Companies across all industries, from consumer retail and health care to insurance and financial services, rely on Lucidworks every day to power their consumer-facing and enterprise search apps. Lucidworks' investors include Shasta Ventures, Granite Ventures, Walden International, and In-Q-Tel. Learn more at lucidworks.com.

Sources, in order of first use: Technology Review, Cisco, EMC, Google

Share this e-book with a colleague or friend.

©2015 Lucidworks. All rights reserved.
