Thursday, November 7, 2019

The White Box Obsession in Enterprise Networking - An Attempt to Put an End to the Madness


Network Disaggregation & White Box switching have been the Talk of the Town for quite some time. There are a lot of startups in this space, and "niche players" is one of those overhyped marketing terms used to address that segment of OEMs, alongside others such as SDN, IBN & Cloud.

On the surface they seem to bring a couple of new factors into the equation, which at first sound very innovative, such as:

Your Networking Hardware should be treated as a Commodity, similar to the Server market. The idea is simple: you should be able to buy any x86 based Networking gear and run the NOS (Network Operating System) of your choice on top. In theory the NOS could be purpose built to meet your specific needs. So again, in theory you should end up paying only for the features of the NOS that you are using, which on the surface seems a bit different from the way you used to pay Networking Vendors in the past.

So essentially in theory you end up with:

1. Cheaper Hardware
2. Standardized HW which runs on x86 Architecture instead of a Proprietary Vendor Chipset
3. NOS (Open Source or Vendor NOS) that you can run on Whitebox
4. In theory NOS should also be cheaper as you might only pay for what you use

So on the surface those promises are very interesting indeed.

But...

If that's the case, why has the Whitebox idea not been that successful so far, at least in the Enterprise Segment?

Well, let's look deeper and travel back in time to find some details. Where did it all probably start?

Well, it started as different movements at different places. Some old schools of thought around SDN are good starting points, though for many that is still just one interpretation of what SDN means.

At the same time the Webscales of the world had different problems to address:

1. Pace of innovation - Innovation, interestingly, is one of those overhyped and overly abused terms, just like Digital Transformation. In simple terms, in our context it means the pace at which I can build features, functions or integrations.

Let's say your specific environment needs some very specific add-ons on top of an existing feature (though in theory you might want a new feature). This was, and probably still is, one of the most important considerations for large webscales and cloud providers, as requesting these add-ons or features from vendors usually takes a lot of time. The vendor, on the flip side, might not be at fault if you are not big enough, are not giving them enough business, or the frequency of these kinds of requests is not high enough. The moment you don't find that right balance, your requests will likely be pushed back or thrown onto a potential road-map presentation slide.

Another roadblock could be compliance and privacy, which both sides may see as an issue under co-innovation.

But do you want to go that far as an average Enterprise ?

2. Who wants to keep the Control - This can also be looked at through different lenses, and innovation can be seen as just a subset of it.

Now let's look at a simple perspective here:

Q: Who pays your bills?
A: Business/Job

Q: How does the Business make money?
A: By running Apps, managed through People & Process, which help it make money

Now you must understand how fundamentally different Webscales are from Enterprises. At a very basic level Webscales build applications first and later design and build infra to best suit those Apps. Also most Apps are written by the Webscale's own staff.

On the flip side:
Most Enterprises have a lot of Organic Growth, wherein you build Infra first (the minimum capability) and run applications on top, which in most cases are sourced from Vendors. So:

A. We pick infra first and later try to lay applications on top as best we can.
B. You as an Enterprise don't have much access to make changes to the Apps themselves for the most part. Depending upon your size you can either go through the same cycle I described under Point 1, or you can build abstractions on top, which brings another set of complexities and considerations into the mix.

3. Platform Independence - The idea sounds interesting, and marketing really seems to have done its job well on different social media platforms. But think deeply about whether it makes any difference in your context, and whether Networking Gear vs. Server Gear really needs to be compared. Try to map it back to your Organisation's IT strategy and see if it makes any sense. Another interesting problem here could be:

- If something breaks and you don't know if it's HW or SW, how do you stop the finger pointing? That is one of the Operational Challenges you may face. The possible counter-argument is that the Whitebox vendor eco-system is the answer to this problem. But wait, were we not looking for complete independence through Disaggregation? So that's another tradeoff if you look closer.

4. Cheaper Hardware & NOS - Well, x86 boxes being cheaper looks promising. But you must understand that most of your current vendors don't make their money from selling Hardware anyway, for the most part. Depending upon your size as a Customer and the loyalty you bring to the table for your respective OEM, I have seen discounts anywhere between 60% - 80%, and in exceptional cases 90% to 100%, depending upon the play. A good example would be a Vendor that is into the Software led Networking solutions space. The vendor can also give you lots of free stuff initially under its pull-through strategy, for which you might otherwise end up paying heavily, such as Professional Services. Also most modern & Traditional Networking Vendors offer their NOS under different subscription models and licensing tiers, which evens out the balance here.
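
To make the discount argument in point 4 concrete, here is a rough sketch in Python of the kind of back-of-the-envelope comparison you could run. Every figure (list price, discount, yearly NOS and support cost, the 5 year period) and both function names are made-up placeholders for illustration, not real quotes, so plug in your own numbers.

```python
def discounted_price(list_price: float, discount_pct: float) -> float:
    """Price after applying a percentage discount to the list price."""
    if not 0 <= discount_pct <= 100:
        raise ValueError("discount_pct must be between 0 and 100")
    return list_price * (1 - discount_pct / 100)


def total_cost(hardware: float, nos_per_year: float,
               support_per_year: float, years: int) -> float:
    """One-off hardware cost plus recurring NOS and support cost."""
    return hardware + years * (nos_per_year + support_per_year)


# All numbers below are hypothetical placeholders, not vendor pricing.
oem = total_cost(
    hardware=discounted_price(list_price=20_000, discount_pct=70),
    nos_per_year=1_500,
    support_per_year=1_000,
    years=5,
)
whitebox = total_cost(
    hardware=5_000,
    nos_per_year=2_000,
    support_per_year=1_200,
    years=5,
)

print(f"OEM over 5 years:      {oem:.0f}")
print(f"Whitebox over 5 years: {whitebox:.0f}")
```

With a different discount or support price the result can flip either way, which is why the comparison has to be done on the whole period and not on the hardware sticker price alone.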

Now assuming you still want to continue with a "Whitebox Strategy", here is another good set of considerations:

1. RMA Strategy - Dig deeper into the Whitebox vendor's RMA strategy and SLAs, as a relatively new player might not be able to match expectations. It's usually a larger problem for Organizations with a Global footprint.

2. Quality of TAC Support - Assuming the vendor is relatively new in the Industry, a good assumption would be that their HW or NOS is not mature enough. Maturity comes over a period of time as they grow, and as people put their stuff into interesting environments and push the limits. Also, any given OEM's TAC relies not only upon experienced staff but also on how large its Database is.

3. Existence - Given the rapid pace at which the industry has evolved in the last couple of years, we see some traditional players going out of business and niche players being bought by Vendors with deep pockets in M&A scenarios. So your favorite Whitebox vendor or NOS vendor may not last in the Industry, for different reasons. ROI and Future protection are key considerations here.

4. Re-Skilling - Your existing Operational staff likely has no prior experience of operating in this new way of thinking or working per se. But that's not where it stops. During the transition/migration you may end up maintaining a Hybrid environment (OLD + NEW), and if you haven't put enough thought into this and got your SOWs, Run Books etc. revised, it may do more harm than good. And of course understand the learning curve and its tradeoffs.

5. Platform Performance & Integration - Most enterprises, at least in my experience, don't have Platform testing teams in house and rely on vendors, which works on a TRUST MODEL. With a traditional vendor you at least have some level of control if at a later stage the Gear doesn't meet expectations. In a disaggregated world it gets tricky, as HW and SW come from two different vendors and the finger pointing likely doesn't go away. The moment you again apply the Eco-system logic, you run into another Anti-Pattern, since the goal was complete independence.

6. Reach - Most people don't consider this, but the reach / ecosystem of System Integrators is another good consideration. Major SIs in the market of course work with Major OEMs, which means a relatively new SI may match neither the reach, especially in a Global context, nor the quality of staff.

7. Experience - This being a new way of doing stuff, how would you compensate for the lack of experience?

8. New Security Surface - Every platform and NOS has its own security surface, which needs to be addressed by the Organisation's (Client's) IT Sec policy or Cyber Security policy, whichever terminology you prefer. How your favorite Whitebox and NOS vendors will fit into this existing ecosystem is an important consideration, and most CISOs likely have little or no first hand experience with such environments. So you as a Network Team are giving him/her some interesting work, but how many people like to work hard?

So we can summarize that White Box is nothing new from an Innovation or Engineering standpoint as such. At best it's a fancy new "Consumption Model", similar to "Cloud".

But in the end you must understand and articulate the real value of a White Box and independent NOS strategy in terms of the Business Objectives you are going to meet. If you don't do it correctly, this entire exercise will go for a toss anyway.

Good Luck...

Further Reading:

https://packetpushers.net/research-towards-open-disaggregated-network-operating-system-att/

HTH...
Evil CCIE

Tuesday, August 13, 2019

Important Considerations To Get Your Zero Trust/MicroSegmentation Project Right ... A Network Artist's Perspective


I recently came across an interesting blog talking about when & how the "Zero Trust" idea surfaced almost a decade ago, along with a really good approach the author has put together for tackling a Zero Trust project.

While the approach seems good for the most part, I found a few small gaps and some additional considerations that need to be well thought through in order to get it right in the real world.

So here is a quick summary - feel free to add and correct. Again, it's my personal perspective and nothing against the original blog author.


1. By implementing Zero Trust, you just increased the Network Complexity in a significant manner (I'll save the P vs. NP analysis for later :) ) and, more importantly, the Operational Complexity. The Author didn't really touch on those important topics, I guess. You don't want to end up increasing MTTR and MTTI.


2. It would require a big Cultural Change in the organization to be successful. Similar to NetDevOps and other fancy stuff.



3. Points 2 and 3 (in the original blog) seem to pull in opposite directions. You want every communication to be encrypted/secure and you want to inspect everything too. There is probably no easy way to do that in real life.



4. Every encrypted communication will probably add some performance degradation, even assuming CPU and Memory are no longer issues. You run into other fancy issues such as MTU and MSS. Having multiple hops involved in encryption/decryption for inspection would add significant delay, expose the traffic to man-in-the-middle attacks and break the end to end communication flow. And never underestimate the madness that things like NAT can add to this.



5. Convergence becomes a challenge. (Networking Convergence is not the same as Application Convergence)



6. How this model maps to Overlay Networking is an interesting area to get your head around and think through. (Stitching those policies across Campus, WAN and DCs needs to be considered too, as most vendor solutions in these spaces are pretty much black boxes)



7. Is your NMS ready to Monitor such Network and Network Constructs ?



8. How you map this model to Telco Services and SLAs will be worth taking a look at.



9. For application dependency mapping you need to invest in APMs. That is usually a pretty big investment, I guess, and it takes a good amount of time not only to deploy it but to get it right.



10. My Fav One - Are you solving the right problem to begin with? It's not only about doing the right things but, more importantly, doing things right. :)

11. Impact on Customer Experience. Most organisations don't even want you to touch that area in case there is any impact... even if it's little.
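
The MTU/MSS issue from point 4 is easy to see with a bit of arithmetic. Below is a minimal Python sketch. The 20 byte IPv4 and TCP header sizes are the standard sizes without options; the function name and the 60 byte encapsulation overhead are assumptions for illustration only, since the real overhead depends on the tunnel type, cipher and whether NAT traversal is in play.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
TCP_HEADER = 20   # bytes, TCP header without options


def tcp_mss(link_mtu: int, encap_overhead: int = 0) -> int:
    """Largest TCP payload per packet that fits the MTU without fragmentation."""
    mss = link_mtu - encap_overhead - IPV4_HEADER - TCP_HEADER
    if mss <= 0:
        raise ValueError("overhead leaves no room for payload")
    return mss


plain = tcp_mss(1500)
# 60 bytes is a hypothetical placeholder for encryption/tunnel overhead.
tunnelled = tcp_mss(1500, encap_overhead=60)

print(f"MSS without encapsulation: {plain}")
print(f"MSS with encapsulation:    {tunnelled}")
print(f"Payload lost per packet:   {plain - tunnelled}")
```

Each extra encrypt/decrypt hop that adds its own encapsulation shrinks the usable payload further, and if the end hosts are not told about the smaller MSS you end up with fragmentation or dropped packets.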



It goes back to the basic principle of Computer Science around State, Surface & Optimization. The Author seems to be focussed on only a single dimension, Surface (though Surface itself has many micro areas to touch upon).



Maybe a good time to look at the OODA loop for Cyber Security?



What would the governance model and the business case to get funds look like, and how would you measure the success of such a project?


And never underestimate RFC 1925 rule 8 :)

HTH...
Deepak Arora
Evil CCIE

Wednesday, May 1, 2019

AWS Certified Solutions Architect - Associate (SAA) - Exam Review & Recommendations For Network Engineers



This exam review is written with Network Engineers in mind, to help them with the mindset required for the exam. If you are an existing Cloud Practitioner or DevOps Engineer you probably wouldn't need my advice anyway. 😇

Study Resources I used for Preparation

- A Cloud Guru AWS Cloud Practitioner Video Course - This was the first time I had explored AWS at all, so I went through this quick course to get the required background and get familiar with AWS.

- A Cloud Guru AWS SAA (2018) Video Course, which also includes a practice exam for each section and a full length practice exam at the end. The video course is definitely good to get started and understand the AWS services in general, with some nice demos on the console. Strongly recommended to lay a good foundation for exam prep.

- AWS Official SAA Certification Guide - It's not updated for the current exam and is probably 2-3 years old, but it still serves the purpose of preparing for some of the core topics of the exam.

- AWS Live Lessons - Good Video Training Course to complement the A Cloud Guru Video Training Course. Some of the topics are better covered.

- Free Practice Exam From Whizlabs - You can go ahead and buy the paid ones as well, but I only came to know about this a few days before the exam. So I only attempted their free exam, and it was a good experience. What I liked about them is that they also give you explanations in case you get something wrong.

- AWS Recommended White papers and FAQs - The White papers are really important to go through cover to cover. The FAQs are important but there are too many; I would recommend going through at least the Compute, Storage & Database FAQs.

- Practice on Live AWS Console using Free Tier Account - Just make sure you set up billing alerts so you don't end up spending a lot of money from your credit card, as people often forget to delete the configuration after practice sessions on free tier accounts.

The Beginning & Initial Challenges
As I mentioned earlier, this was my first step towards learning more about cloud, and AWS in particular. Since I didn't have enough background on AWS Services, I started with the A Cloud Guru AWS Cloud Practitioner Video Course. The course is a quick and short introduction that gives you a high level perspective on AWS services in general. Some of the topics covered in the course are unique in the sense that they are not covered later in the AWS SAA Exam Blueprint.

Later I went through A Cloud Guru AWS SAA Video Course 3 Times. Why 3 Times... I'll tell you later. :)


The SAA Video course is good for gaining intermediate level skills, and each topic is covered in much more depth compared to the Cloud Practitioner course. At the end of each section there is a practice quiz, which is good for testing your skills on a given topic. I pretty much scored 90 or above in all practice sessions at the first go (after the 3rd run), and also scored 94 in the full length practice exam at the first go. Interestingly, the last 3-4 sections in the video course, such as Application Services and the Well Architected framework, don't have any practice exams, if I remember correctly. Also, you must try to go through on your own all the Demos that the instructor walks you through on the Live AWS Console.

While practice on the live console is a really good way to get familiar with the services, in the exam itself I didn't get any questions around Configuration, such as Configuration snippets to be verified or steps to configure a given service or scenario. Or at least that was the case in my exam.

2 weeks before the exam I also purchased AWS Live Lessons on a friend's recommendation and quickly went through it. The course gives you fresh thoughts and a few new perspectives. Some of the topics IMHO are better covered compared to the A Cloud Guru Video Course.

There are a couple of Challenges I faced during the exam prep, as follows. You must have a plan to get rid of those.


1. The Cloud Practitioner Mindset - This is probably the toughest challenge for Network Engineers to overcome. There are a couple of ways to look at it. Adopting the mindset is one part of it. But personally, IMHO, the other challenge is that most self paced Video Courses assume you are an Application Guy. The SAA exam is 95% about Applications and only 5% about stuff like Networking, Load Balancing & DNS theory that we as Network Engineers are mostly familiar with. I remember A Cloud Guru instructor Ryan mentioning at the end of the VPC section that "This is by far probably the most difficult section for you guys" and I was like "Dude, it was so easy and the only section I understood 100%" :)

That's the reason I had to go through the Video Course 3 times. In the first run there is too much new information around Application stuff that you need to get familiar with. Also, details like the different flavors of storage, compute etc. and how pricing is done for each give you too much information to remember. So in the first run I probably only understood and could remember 50% of the stuff. To overcome this challenge I started taking notes in the 2nd run. But it was very time consuming, as you often have to pause the video and start writing.


2. Learn a Lot more about Applications - This is another great challenge. It might be easier to overcome for people with a Computer Science background from university. Initially I thought the exam must be more IaaS focussed, but I was completely wrong here. The exam is not about a general Compute, Storage & Virtualization discussion. It is more focussed on a deep level understanding of storage, life cycle management, Event Logging, Monitoring, Different Types of Databases, Database Scalability techniques etc. Which means as a Network Engineer you have to spend a lot of time researching to get a more detailed understanding of such topics. This is where none of the Self Paced video training vendors met my expectations, including the official exam certification guide, since they only teach you very fundamental skills while the exam requires much more advanced skills.

For example a Database Scalability problem can be solved in many different ways - Scale Compute, Spin New Instances, Use Auto Scaling, Go multi AZ, Spin New Read Only Instances, Run Traffic Distribution with DNS and LB Techniques, Move to Non Relational DB, Use DB Caching Techniques, Use Database Acceleration Techniques etc. So if you get a question in the exam around a Database Scalability issue, you really have to think through all these options based on the scenario given, its specific requirements and the current landscape. That includes keeping in mind whether the requirements are talking about a problem with read only scalability, read write scalability or write only scalability, or a scenario where it's the queries that are taking longer to respond. It could also be about data coming from multiple streaming sources, such as an IoT kind of setup or maybe online gaming. So as you can see, there could be a lot of things going on in parallel in the exam. Now throw things like Lambda into the mix :)

3. Outdated Exam Certification Guide - The official exam certification guide is very outdated, although it still helps with some core topics like EC2, Storage and DNS. If I were to study today using a book, I would rather pick a newer one that has been updated for the new exam.

4. Practice Exams - None of the full scale practice exams included in the Self Paced Video training courses were anywhere close to the real exam in terms of complexity.

5. Study Group - In the past, for all my Cisco CCIE and CCDE exams, I used to run study groups. But this time the idea didn't work out for many reasons. Different Time Zones, Different Backgrounds etc. were causing issues. Personally I would still recommend creating one if possible and throwing challenges and scenarios at each other.
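
The Database Scalability reasoning under point 2 can be practised as a simple lookup: first identify what kind of bottleneck the question describes, then shortlist the options that address it. Here is a rough Python sketch of that habit. The option names come from the list above, but the mapping itself, the `Bottleneck` enum and the `shortlist` function are assumptions made up for illustration, not an exam answer key; the right answer always depends on the exact scenario.

```python
from enum import Enum
from typing import Dict, List


class Bottleneck(Enum):
    READ = "read only"
    WRITE = "write only"
    READ_WRITE = "read write"
    SLOW_QUERIES = "slow queries"


# Hypothetical mapping for study purposes only; adjust it as you learn.
OPTIONS: Dict[Bottleneck, List[str]] = {
    Bottleneck.READ: [
        "Spin new read only instances",
        "Use DB caching techniques",
        "Run traffic distribution with DNS and LB techniques",
    ],
    Bottleneck.WRITE: [
        "Scale compute",
        "Move to non relational DB",
    ],
    Bottleneck.READ_WRITE: [
        "Scale compute",
        "Spin new read only instances",
        "Use DB caching techniques",
    ],
    Bottleneck.SLOW_QUERIES: [
        "Use DB caching techniques",
        "Use database acceleration techniques",
    ],
}


def shortlist(bottleneck: Bottleneck) -> List[str]:
    """Return a copy of the candidate options for the given bottleneck."""
    return list(OPTIONS[bottleneck])


for kind in Bottleneck:
    print(f"{kind.value}: {', '.join(shortlist(kind))}")
```

Building a table like this for each topic area is also a handy format for the notes mentioned earlier.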

The Exam Day

In the exam you are going to get 65 questions to be answered in 130 mins. Personally I think that time is more than enough if you are well prepared. I finished my exam in 90 mins or so, and spent 15 mins reviewing some questions and changing my answers for those I had tagged for review earlier. This was a bit different from Cisco exams, where you are not allowed to go back and change or revisit a question later.

Some of the questions will have a lot of text to read though, covering requirements, current landscape & expected outcomes. I used the A4 sheets given to me by the exam center to note down key requirements from the text, to ensure I stayed focussed and didn't have to read through the entire question several times. They don't however allow you to highlight text on the screen itself like in the Cisco CCDE lab exam. IMHO that would be cool though. 😎

The exam result is flashed momentarily at the end of the exam, and they don't show the score right away. It took them around 24 hrs to post my score and certificate under my AWS certification account online. You will likely only see a Congratulations message or a Sorry message when you end the exam. What that means is easy to figure out 😋

EC2, Storage, Database, DNS & Lambda alone covered 75% of my exam. Misc. topics such as Application Services, CloudTrail and CloudWatch, Beanstalk & CloudFormation were also on the exam.


HTH...
Evil CCIE / A Network Artist