Fresh Search Results with SharePoint


How fresh do you want your search results?

Introduced with SharePoint 2013, we got a new Search feature that supports better content freshness. It is called continuous crawling. The name suggests SharePoint crawls and processes content continuously. The truth is – by default – it is a type of incremental crawling in 15-minute intervals with a few extras.

Once we know something, we find it hard to imagine what it was like not to know it.

Chip & Dan Heath, Authors of Made to Stick, Switch

Continuous crawling

…is a type of crawl whose purpose is to keep the index as current as possible by fixing the shortcomings of incremental crawling.

Incremental crawling

…is a type of crawl where content already in the index is crawled again – e.g. to pick up changes.

Full crawling

…kicks off content discovery of the entire content source.

 

Typical default schedules: incremental crawl every 30 minutes, continuous crawl every 15 minutes, full crawl every Sunday.

Use continuous crawling for SharePoint content – it is not available for indexing external content, e.g. BCS, file shares and websites. With continuous crawling enabled you will benefit from parallel indexing of content.
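A minimal sketch of switching it on per content source from PowerShell – assuming the default content source name “Local SharePoint sites”; adjust the name to your environment:

# Run in the SharePoint Management Shell
$ssa = Get-SPEnterpriseSearchServiceApplication
# Example content source name - replace with yours
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint sites"
Set-SPEnterpriseSearchCrawlContentSource -Identity $cs -SearchApplication $ssa -EnableContinuousCrawls:$true
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint sites"
Write-Host "Continuous crawls enabled: $($cs.EnableContinuousCrawls)"
Enabling continuous crawling on a content source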

Imagine some major changes happening on the content, and crawling needs more time than usual to process them. With continuous crawling the next crawl won't wait for completion – it will kick off as scheduled and process the latest changes – crawling runs in parallel.

Situation

Simon is a project manager and is introducing a lot of changes to large documents in a short time. While these changes trigger the SharePoint crawler, Anita, who works in finance, is uploading her calculation and expects it to be aggregated and displayed in the finance portal by search-driven web parts.

Without continuous crawling

Anita has to wait for the current crawl to finish so that the next crawl can start and process her calculation sheet. On some environments even incremental crawls can take up to 60 minutes. So in the worst case Anita has to wait [60 minutes + time for the current crawl]. This behaviour will irritate the users – depending on how heavily the portal relies on search – and it will likely cause some headaches for the management, as users will complain if your portal is completely search driven.

With continuous crawling

While a crawl is running to process the “deep” change, another crawl kicks in after 15 minutes in parallel and eventually processes Anita's Excel sheet. Even if one crawl needs longer due to the processing of “deep” changes, another crawl will begin its work as scheduled, without being bothered by any other crawls.

Use continuous crawling on all SharePoint content sources, and if your farm has the power – e.g. it responds quickly to user requests during crawls – and you want users to have super fresh search results, change the continuous crawling schedule. It is 15 minutes by default.

 

# Run in the SharePoint Management Shell
$IntervalMinutes = 10
Write-Host "Changing Continuous Crawl Interval to $($IntervalMinutes) minutes"
$ssa = Get-SPEnterpriseSearchServiceApplication
# The interval is stored as a property of the Search Service Application
$ssa.SetProperty("ContinuousCrawlInterval",$IntervalMinutes)
$ssa.Update()
$interval = (Get-SPEnterpriseSearchServiceApplication).GetProperty("ContinuousCrawlInterval")
Write-Host "New continuous crawl interval set to $interval"
Changing continuous crawl interval

Analysing Storage Performance


A critical view on the storage subsystem with DiskSpd

When it comes to SharePoint performance, fast storage is key – so how do we measure storage, and how do we apply our usage pattern?

Microsoft superseded SQLIO when the DiskSpd release came out on 12/14/2015.

So we are dealing with DiskSpd from now on. In comparison to SQLIO, DiskSpd brings a few (to me) interesting features to the table.

 

New features:

  • Consumable XML output for automation support, e.g. scheduled analysis runs throughout the day powered by PowerShell
  • Custom CPU affinity options
  • Synchronisation and tracking functionality
  • Ability to also target physical disks
  • Variable read/write ratio

Purpose of DiskSpd

With DiskSpd we are simulating workload – specifically for SQL.
We are generating lots of IOPS – some might say “Ayoub's” here, which is my name and actually sounds very funny.

To have clean tests:

  • If you are using iSCSI LUNs or SMB shares, you depend on the network – make sure you are “alone”
  • If you are using a SAN, make sure you don't have any other systems consuming the shared resources – reduce the noise as much as possible.


 

So let's get our brains working with some more parameters and their meanings. Fasten your seatbelt – I am about to decrypt a few things and put them in context with the real world.

What’s likely your setup?

You are running your servers on top of a virtualisation layer, e.g. ESX / Hyper-V, and your underlying storage… could be anything. It doesn't really matter to us, as we don't want to dig around in the storage architecture corner. But we need to know a few things from the storage engineers.

  • What is the block/stripe unit-size on the storage?
  • What is the block size on the guest?

Got the feedback? The block size on the guest and on the storage should be the same. Either take the block size of the guest or get the disks re-provisioned. Oops.

Alright, that's it… but you can check it yourself, to be sure.

Run fsutil fsinfo ntfsinfo d: in any administrative shell.

The “Bytes Per Cluster” value is your block (allocation unit) size – e.g. 65536 means 64K.
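If you prefer to read the value from PowerShell, here is a quick sketch – the drive letter d: is just an example, and it assumes the fsutil output contains a “Bytes Per Cluster” line:

# Extract the NTFS allocation unit (block) size from the fsutil output
$raw = (fsutil fsinfo ntfsinfo d: | Select-String "Bytes Per Cluster").ToString()
# Take the first number after the colon (some Windows versions append e.g. "(64 KB)")
$bytesPerCluster = [int]((($raw -split ':')[1]).Trim() -split '\s+')[0]
Write-Host "Block size on d: is $($bytesPerCluster / 1KB) KB"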

 

Ideally you have a 64K block size on the storage and on the guest.

If you are dealing with SQL and you use iSCSI LUNs, format them with 64K allocation units, attach separate LUNs, and separate OS, data files and log files.
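For the formatting part, a minimal sketch with PowerShell – the drive letter and label are placeholders for your own data/log volumes:

# Format a data volume with a 64K NTFS allocation unit size (drive letter is an example)
Format-Volume -DriveLetter E -FileSystem NTFS -AllocationUnitSize 65536 -NewFileSystemLabel "SQL-Data" -Confirm:$false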

Hint: If you are on Hyper-V with Live Migration enabled, ensure that your anti-affinity rules are doing their job and prevent your virtual disks from eventually ending up together on one LUN.

Let’s get started and download DiskSpd here.

Put it on drive C and use the following parameters.

  • -h disables software caching and hardware write caching, like SQL Server does for its data files
  • -t8 number of threads – adjust this if you know the code and know how the app is talking to your SQL box. If you have a chatty black box, leave it at 8 or even increase it
  • -c1G size of the data file in gigabytes. Leave it at 1G if you are dealing with SharePoint, for example.
  • -w25 25% writes vs 75% reads – you are invited to play with these values.
  • -o8 queue depth/length per thread (number of outstanding requests in the queue)
  • -b64K block size of your disks
  • -d60 duration of the test in seconds
  • -r random I/O instead of sequential
  • -Z1G size of the write source buffer supplying random data for our write operations
  • -L capture latency – we really want this

Hint

Before you do anything on the systems in your corporation, align with the sysadmins first: tell them what you are doing and let them know the impact of the testing. They will likely schedule this with you outside business hours if live systems will be affected.

.\diskspd.exe -b64K -d60 -o8 -t8 -h -r -w25 -L -Z1G -c1G c:\io.perf

Et voilà – have fun with the data.

 

You are interested in:

  • Latency
  • I/O per second (read & write)
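DiskSpd's XML output (-Rxml) makes the scheduled analysis runs mentioned earlier easy to collect. A minimal sketch – the paths and the results folder are examples, and it reuses the parameters from above:

# Run DiskSpd with XML output and archive one timestamped result file per run
$resultDir = "C:\diskspd-results"            # example folder
New-Item -ItemType Directory -Path $resultDir -Force | Out-Null
$stamp = Get-Date -Format "yyyyMMdd-HHmmss"
& C:\diskspd.exe -b64K -d60 -o8 -t8 -h -r -w25 -L -Z1G -c1G -Rxml c:\io.perf |
    Out-File (Join-Path $resultDir "run-$stamp.xml")
Write-Host "Stored result as run-$stamp.xml"

Load a stored file later with [xml](Get-Content) and pull the latency and IOPS figures out of it.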

Part 1 SharePoint Documents – the unbroken link

Intro

I have been working on SharePoint projects for more than 10 years, starting with SharePoint 2003 in 2004.

Many years and features later, one of the most relevant capabilities of SharePoint is to enable the business to store and work with documents – yet we always tend to overlook this very basic need.

No Attachments

So instead of trying to think about new features our customers might find cool and useful… take a few steps back and look at the document handling.

Basically, what happens in a project about introducing SharePoint as a collaboration platform into an organisation is that you always have a major objective (written or implicit): you first want to get rid of sending documents attached to mails.

So instead of putting attachments into our mails, we upload the document to SharePoint, we start adding a link to it in our mail, we teach others to do the same – and because parts of the new collaboration world are not as static as we thought, we move libraries & sites around and, at best, we change the URL of the collaboration space.

The Pain

Eventually we will end up in a situation where more than half of our document references sent by mail end up as broken links.

Half of the team knew that this would get problematic, and the other half just fell out of the blue sky – not knowing how to handle situations like that.

Not so problematic you say?

Well – you may collaborate with external clients & partners. Do you want to send them the new link to the documents over and over again? No? I thought so. 😉

Solution


This is part one of the series.

I am writing part 2, where I will explain what SharePoint 2013 and SharePoint 2016 bring to the table.

Office Delve in the Organisation with SharePoint 2016


With Office Delve, Microsoft is taking the fight to data & information silos – it is all still work in progress, so to speak, but what is happening right now is extremely exciting.

Office Delve has been changing steadily since September 2014 – it is a new strategy to create a simplified view of complex data and information streams – without having to search first…
