
Yan Cui

14 Podcast Episodes

Latest 28 Jan 2023


Yan Cui: AppSync, VTL, and Creating Tech Content in 2021


Yan joins Adam to discuss their shared love for AppSync & VTL and his experiences as a thought leader and content creator in the serverless space.

1 Oct 2021


Yan Cui: 10 questions about cloud

Roopu Cloud's Podcast

In this episode, Yan Cui answers 10 questions about cloud. Yan is an AWS Serverless Hero and the author of Production-Ready Serverless. He helps organizations go faster and deliver more with less. Imagine if your feature velocity goes from months to days, and your systems become more scalable, more secure, more resilient, AND cheaper to run!

MEET YAN CUI
➡️ Twitter: https://twitter.com/theburningmonk
➡️ LinkedIn: https://www.linkedin.com/in/theburningmonk/
➡️ Youtube: https://www.youtube.com/channel/UCd2PaRjI5iAGgeld3lCFPNg
➡️ Podcast: https://realworldserverless.com/
➡️ Github: https://github.com/theburningmonk
➡️ Website: https://theburningmonk.com/

MEET PABLO PUIG
➡️ LinkedIn: https://www.linkedin.com/in/pablo-puig-433295171/

PODCAST "10 QUESTIONS ABOUT CLOUD"
➡️ Web: https://roopu.cloud/podcast
➡️ Spotify: https://open.spotify.com/show/4kH3z7x0Eydh1lBlvncLyQ
➡️ Apple Podcasts: https://podcasts.apple.com/us/podcast/roopu-clouds-podcast/id1539635929

MEET ROOPU CLOUD
➡️ https://roopu.cloud

#cloud #cloudcomputing #podcast #roopucloud


28 Mar 2021



#33 - Yan Cui Returns!

Talking Serverless

Yan Cui is back for a second round! Ryan Jones sits down with the AWS Serverless Hero, author, and podcast host to discuss his AppSync Masterclass course, no-code vs. low-code, the abstraction of CloudFormation, and much more. Dive into the rabbit hole of serverless with us on this loaded episode that you won't want to miss.   If you like this podcast and want more content, visit our website where it's all there and always free: talkingserverless.io   You can find more from Yan Cui on his website and be sure to give him a follow on his Twitter: @theburningmonk --- Send in a voice message: https://anchor.fm/talking-serverless/message


11 Feb 2021


Yan Cui: On Becoming an AWS Serverless Hero

Exploiting with Teja Kummarikuntla

What does it take to stay consistent with a technology and build a career through valiant effort? Yan Cui, AWS Serverless Hero, Developer Advocate at lumigo.io, independent consultant helping companies around the world adopt serverless, and host of the Real-World Serverless podcast, shares the journey behind becoming the burning monk of the cloud ecosystem.

Book: Serverless Architectures on AWS, Second Edition | Manning Publications

Guest:
Blog: https://theburningmonk.com/
Podcast: https://realworldserverless.com
Masterclass: http://appsyncmasterclass.com
LinkedIn: https://www.linkedin.com/in/theburningmonk/

Sponsors:
Sundog-Education: https://sundog-education.com
Manning Publications: http://mng.bz/RM0v


30 Nov 2020



Yan Cui on Serverless Orchestration & Choreography, Distributed Tracing, Cold Starts, and more

The InfoQ Podcast

Today on the InfoQ Podcast, Yan Cui (a long-time AWS Lambda user and consultant) and Wes Reisz discuss serverless architectures. The conversation starts by focusing on architectural patterns around choreography and orchestration. From there, the two move into updates on the current state of serverless cold start times, distributed tracing, and state. Today’s podcast, while not specific to AWS, does lean heavily on Yan’s expertise with AWS and AWS Lambda. Listen to the podcast for more.

Curated transcript and more information: https://bit.ly/2YPOLj5
Follow us on Facebook, Twitter, LinkedIn, YouTube: @InfoQ
Follow us on Instagram: @infoqdotcom
Stay informed on emerging trends, peer-validated early adoption of technologies, and architectural best practices. Subscribe to The Software Architects’ Newsletter: www.infoq.com/software-architects-newsletter/


31 Aug 2020


Yan Cui at Frontend Developer Love 2020

JSWORLD Podcast by Passionate People

In this episode, we hear from Yan Cui, also known as the “Burning Monk”. Yan currently works as an independent consultant but is well known as a great speaker, trainer, and as he describes himself, an AWS serverless hero.--- Send in a voice message: https://anchor.fm/jsworld-podcast/message


3 Jun 2020


Episode #33: The Frontlines of Serverless with Yan Cui

Serverless Chats

About Yan Cui: Yan is an experienced engineer who has run production workloads at scale in AWS for 10 years. He has been an architect and principal engineer in a variety of industries ranging from banking and e-commerce to sports streaming and mobile gaming. He has worked extensively with AWS Lambda in production, and has been helping clients around the world adopt AWS and serverless as an independent consultant. Yan is an AWS Serverless Hero and a regular speaker at user groups and conferences internationally. He is also the author of several serverless courses.

Twitter: @theburningmonk
Blog, Courses, Workshops: theburningmonk.com
GitHub: github.com/theburningmonk

Transcript:

Jeremy: Hi, everyone, I'm Jeremy Daly and you are listening to Serverless Chats. This week I'm chatting with Yan Cui. Hi, Yan. Thanks for joining me.

Yan: Hi, Jeremy. Thanks for having me.

Jeremy: So you are a developer advocate at Lumigo. You are an AWS Serverless Hero, you are also an independent consultant, and I think most people know you as the Burning Monk. But why don't you tell us a little bit about yourself and what you've been up to lately?

Yan: Yeah. I'm all those things you just mentioned. I'm doing some work with Lumigo as a developer advocate, where I'm focusing a lot on open-source tooling and articles, and in my own capacity as an independent consultant I also work with a lot of clients directly. A lot of them are based in London, where I used to be based. Nowadays I've moved to Amsterdam. I still do a lot of open-source work. I just started a new video course focusing on Lambda best practices. Then I'm also doing some workshops around the world, in Europe and now looking at the U.S. as well. So I'm doing a lot of different things to keep myself busy.

Jeremy: Awesome. Listen, I could talk to you probably about anything. Anybody who knows or has seen some of the work that you've done knows it's quite expansive. It's very impressive.
In 2019, I have some numbers here: you did 70 blog posts, something like 2,200 students in your video courses. You spoke at 31 conferences in 17 cities. But more importantly, you helped 23 clients in 11 different cities. So you are on the front line here in seeing how companies are adopting serverless, and not from one perspective. I think that's what we get a lot from different companies: there is one perspective of how they adopt serverless and how they are working with it. You've obviously seen this from multiple perspectives, so I want to talk about adoption a little bit. We'll talk about a few other things, but what are you seeing with companies now? The customers or the clients you're working with, what are they using serverless for?

Yan: They are using it for all kinds of different things, depending on, I guess, the maturity of the company and the domain they're working in. I've got a lot of clients that are either enterprises or small and medium-sized enterprises, and even some stealth-mode startups as well. And obviously their constraints are completely different. That's one of the things I really enjoy about being a consultant: I get to see a lot of different perspectives, and what may work for one company may be completely inappropriate because of the constraints a different company would have. So in terms of adoption patterns, you see a lot of the, I guess, startups that are in a position where they can go all in on serverless. They are serverless-first from the get-go. But then at the same time, you also have lots of, I guess, mid-size companies and enterprises. They have so much existing intellectual property that it wouldn't make sense for them to rewrite everything just so that they can run code on Lambda. For all those companies, you see a mix of greenfield projects.
These are serverless-first, and then at the same time there's some effort to migrate some of the existing projects to work on serverless, at least to some degree, at least gradually. Of course, that depends on a lot of constraints, like how much on-premises stuff you have, or whether you have to run everything in Java, in which case cold start performance is a concern. A lot of those limitations, I guess, affect how quickly and how much you are able to go in on this whole serverless-first mindset that we like to have. I think that is probably one of the reasons that serverless adoption hasn't been as fast as many people expected a few years ago: the fact that you can't just lift and shift anymore. It means that you always have to put more thought and planning behind it, and there's also just risk involved. If you make a big mistake and it's your flagship product, of course that's going to put you in a really difficult position. But we do see that companies of all sizes in all fields and all industries are adopting serverless for lots of different workloads, not just APIs but a lot of data processing, IoT, you name it.

Jeremy: That's actually one of the things I'm curious about too. You mentioned customers in all different industries, which is really interesting, because we've gotten to the point now where I think every company is a software company. Everybody is building some sort of software now. So what are the constraints that these companies are working under?

Yan: A lot of them, I guess again, it depends on the industry you're in. Finance companies have to be very careful about a lot of the, I guess, regulatory requirements in terms of how you handle data, and also in some cases having a plan in case you have to move away from a database, for example. That's where some of your vendor lock-in arguments start to kick in.
And also, for example, you have enterprises who have millions of lines of Java code accumulated over 10 years. It's not possible for them to move everything into Lambda if they're seeing one to three seconds of cold start time on those user-facing APIs. Some of those constraints are being lifted, or at least they are now getting better with new features on the platform, but it's still something people have to be aware of, along with the mitigation strategies. A lot of the time the constraint is a lack of knowledge and know-how, because if you think of Lambda as the extension of a lot of other AWS offerings, it means you can't just know one service in isolation; you have to know a lot of different services to take full advantage of serverless. That's where I think a lot of companies are struggling: they just don't have the skill set available in-house. They're exposing developers to things they've never had to think about before. And I think that's where you get a lot of benefit from serverless, from having autonomous teams that can be self-sufficient and look after so many different things, but at the same time, a lot of developers are just not used to working that way. They're used to working in silos where they have very few responsibilities: just write your code, someone else will manage running it in production and manage the infrastructure. Now more of that is your responsibility, which can be a gift, but it can be a challenge for companies that are not used to working that way as well.

Jeremy: Yeah, I totally agree. And I think that, as you mentioned, learning all these other services... I think we're at a point now where for most use cases there is some sort of serverless equivalent or serverless alternative to doing it in a more traditional way. Obviously, we're still missing certain things. I'd love to have some sort of serverless Elasticsearch, for example, which would be really nice.
Are there certain applications where you see people trying serverless, or thinking about serverless, and just saying, "No, I can't do it," because the throughput needs to be higher, or there is too much latency, or something like that?

Yan: Yes. You see cases where, for example, one of my clients had a very complex microservices environment with many API-to-API calls. The fact that you get cold starts on one function may not be an issue on its own; it may not affect your 90th percentile or whatever SLA is set. But when they start to stack up, that becomes a massive issue. So having more control around the warm-up process, provisioned concurrency, should help with those things. But at the same time, that is a slow process: getting the teams educated on what these different features are and how they work. In fact, a lot of questions I get are fairly simple questions like, how do I even do CI/CD? How do I do testing? It's not clear to a lot of newcomers how you do these things. A lot of what we've been taught has been tethered to "there's going to be a server; I can just run everything locally, press F5, and run a local HTTP server." Now everything is running in the cloud. A lot of that mindset change needs to happen, and those kinds of paradigm shifts happen gradually, because everyone learns at a different pace and you need to have some critical mass in the industry.

Jeremy: Yeah. I like the idea of provisioned concurrency actually, because I do think it solves a problem for the right types of applications, especially when there are low-latency requirements. I think AWS has been pretty good about addressing those problems. They've come out now with RDS Proxy, which is helping with connections to relational databases. But I always feel like when that happens, they have to add another service in order for you to make it work. It's not just Lambda functions.
"Hey, we've solved the connection issue with Lambda functions." It's, "We've solved the connection issue with Lambda functions because we've added a new service that now you have to use." And I think those present a number of roadblocks. You had mentioned education as being one of them. So what are some of the other roadblocks that you see companies running into?

Yan: Well, the biggest one by far, I feel, is just education. Like I said, Lambda itself is getting more and more complicated because of all the different things you can do with it. Other roadblocks include, for example, some organizations still holding onto the way they are used to operating, with centralized operations teams and cloud teams. The feature teams don't necessarily have the autonomy they need to take full advantage of all the different tools and all the power and agility you get with serverless. If your team can build a new feature in a week, but it's going to take them three weeks to get anything provisioned and to get access to the resources they need, then again, you're not going to get the full benefit of serverless. So a lot of that legacy thinking at the organization is still there, and it's still a prominent roadblock for people trying to take full advantage of serverless. In terms of technical roadblocks, I think the last question you had was around use cases that just don't fit so well. When you've got a really high-throughput system, the cost of serverless can become pretty high. Imagine you've got something that's relatively simple but has to scale massively, like Dropbox: not a super complex system, but it has to scale to a massive extent. So for them it made perfect sense to move off of S3 and start to build on their own hardware so that they could optimize for cost. A lot of companies do have that concern as well.
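The high-throughput cost concern Yan raises can be made concrete with back-of-envelope arithmetic. The sketch below is illustrative only: the workload numbers mirror the "five endpoints at 10,000 requests per second" scenario discussed here, and the per-request and per-GB-second prices are assumptions for the exercise, not quoted AWS pricing.

```javascript
// Back-of-envelope Lambda cost at sustained high throughput.
// All prices below are assumptions for the sketch, not current AWS list prices.
const REQS_PER_SEC = 10000;                  // sustained request rate across the endpoints
const SECONDS_PER_MONTH = 60 * 60 * 24 * 30; // ~30-day month
const monthlyRequests = REQS_PER_SEC * SECONDS_PER_MONTH;

const PRICE_PER_MILLION_REQUESTS = 0.20;  // assumed $ per 1M invocations
const PRICE_PER_GB_SECOND = 0.0000166667; // assumed $ per GB-second of compute
const MEMORY_GB = 0.125;                  // a small 128 MB function
const AVG_DURATION_SEC = 0.05;            // 50 ms average execution time

const requestCost = (monthlyRequests / 1e6) * PRICE_PER_MILLION_REQUESTS;
const computeCost = monthlyRequests * AVG_DURATION_SEC * MEMORY_GB * PRICE_PER_GB_SECOND;
const lambdaMonthly = requestCost + computeCost;

console.log(`~${monthlyRequests.toExponential(2)} requests/month`);
console.log(`Request charges: ~$${requestCost.toFixed(0)}/month`);
console.log(`Compute charges: ~$${computeCost.toFixed(0)}/month`);
console.log(`Total: ~$${lambdaMonthly.toFixed(0)}/month, before API Gateway charges`);
```

Even with a tiny function and generous assumptions, the invocation volume alone runs to thousands of dollars a month, and per-request API Gateway charges would come on top of that. Run the same arithmetic at low traffic and the bill comes out to pennies, which is the reverse argument Yan describes: the hundred-dollar Lambda bill versus the $10,000-a-month engineer.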
They may not have a very complicated system that requires a hundred different functions in a massive event-driven architecture; maybe they just have five endpoints. But those five endpoints are running at 10,000 or 50,000 requests per second. In those cases, the cost of using Lambda and API Gateway would be excruciating, and you'd be much better off paying a team to look after your Kubernetes cluster or your container cluster than running them on Lambda. But that's always a tricky balance, because oftentimes you get the reverse argument: "Well, Lambda is expensive, so I'm going to just do this myself." But then you're hiring someone for $10,000 a month...

Jeremy: Exactly.

Yan: ... to look after your infrastructure, and your Lambda bill is going to be, I don't know, $100.

Jeremy: And you're hiring more than one person too. Then you're still paying for the bandwidth and some of these other things, and you're still paying for compute somewhere. So that's really interesting. You made a point a little bit earlier about this idea of the paradigm shift, or the mind shift, of going from the traditional lift and shift to bringing things into serverless. Obviously there is a ton that needs to change. We'd like to say it's just programming and you just need to figure out the glue. But you really can't just lift and shift and get those benefits, right?

Yan: Yeah. It's a common pitfall where teams try to lift and shift. Initially it looks like it might work, and then pretty soon they find out the hard way that that approach doesn't scale and doesn't work nicely, and you run into all kinds of different limitations.
For example, one client I've worked with illustrates that point. They had an API which did lots of different things, including some server-side rendering on some of its endpoints, and they moved the whole thing over as one big fat Lambda function. Because one of the endpoints had to access a VPC resource, of course [crosstalk 00:12:59], and now every cold start has to initialize React, which is not a very lightweight dependency. Even an HTTP endpoint that doesn't need it has to initialize it, and then you also have to wait for the VPC initialization and all of that. They were getting performance so bad that it just wasn't acceptable for anything running in production. And unless you know that the reason that's happening is this fat Lambda and how the whole initialization process works, you don't know to split your function up, into one function per endpoint perhaps, or at least some separation so that the resource-intensive functions are separate from everything else. You do find tools that allow you to take your Express app and just run it inside Lambda. They represent an easy path for people to get some of the benefits of Lambda in terms of infrastructure automation and improved scalability and resilience. But unless there's a way for you to later move to the idiomatic way of working with Lambda, with single-responsibility functions, you become a bit locked into the decision the tool has made for you, and it becomes harder to migrate later.

Jeremy: I actually think that's one of the better arguments for moving away from the fat Lambda or the Lambdalith. A lot of people have had a ton of success with that approach; I have used them in the past as well.
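The initialization problem behind the fat Lambda can be sketched in a few lines of Node. This is a toy illustration: `heavyInit` stands in for an expensive dependency like the React setup Yan mentions, and the handler names are made up.

```javascript
// The "heavy dependency" stands in for something like React pulled in for SSR.
let heavyInitCount = 0; // counts how many times the expensive init actually runs
function heavyInit() {
  heavyInitCount += 1; // in real life: hundreds of ms of require() + setup on a cold start
  return { render: (page) => `<html>${page}</html>` };
}

// Fat Lambda: one function serves every route, so the heavy dependency is
// initialized on every cold start, even for routes that never touch it.
const fatRenderer = heavyInit();
function fatHandler(event) {
  if (event.path === '/page') return { statusCode: 200, body: fatRenderer.render('home') };
  return { statusCode: 200, body: 'pong' }; // the /ping route still paid the init above
}

// Split functions: only the SSR function ever initializes the renderer.
let renderer = null;
function ssrHandler() {
  if (renderer === null) renderer = heavyInit(); // init once, only where it's needed
  return { statusCode: 200, body: renderer.render('home') };
}
function pingHandler() {
  return { statusCode: 200, body: 'pong' }; // this function's cold start stays light
}
```

The point of the split is visible in `heavyInitCount`: the fat handler pays the init at module load no matter which route is invoked, while the split `pingHandler` never triggers it at all.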
There have been times where it just seemed to make sense, but certainly there's the bootstrap process. If you're bootstrapping something as part of the warm-up phase of a Lambda function that isn't used by 90% of the code paths, it's a complete waste of time and memory to boot those things up. So I totally agree with that. I think you and I have talked about this in the past too: this adoption pattern is just something that is going to take time. I know you're a big fan of functional programming. I'm actually a big fan of JavaScript functional programming, which I think people think is impossible, but it is. Anyway, it's something that is probably just going to take a little more time for people to understand, working in this different way.

Yan: If you look at where we are with functional programming, it's as old as OO, but when functional programming is going to hit the mainstream is still TBD, which is to some extent frustrating, because for many use cases functional programming is probably the better tool. I'm a big fan of F#, and I've done a lot of things in the past with [inaudible 00:15:46] and such as well. But there is a big mind shift; it changes the way you problem-solve. Mind change doesn't happen overnight, and you have to be patient; you have to give people time to digest and internalize the change and really understand the benefits before they become advocates themselves, or at least practitioners. I do think it is happening, slowly. Just judging by the amount of interest the community is showing and the number of serverless conferences around the world, the interest is definitely there. But we are still a long way from having enough people who are well equipped to succeed. There are definitely a lot of people, your Michael Harts, your Ben Kehoes.
But we need a lot more of them.

Jeremy: I agree. One other thing on functional programming: I tell people, "Listen, once you write a pure function, you'll never go back to writing something else." Anyway, one more question on the adoption side of things, because one of the things I see quite a bit, and I really love this use case for serverless, is this peripheral, DevOps-enhancing sort of stuff. Do you see a lot of that, where companies are using it to do auditing or DevOps automation, things like that?

Yan: Yeah, tons. There have actually been quite a few companies where the main feature teams are not using serverless, but their infrastructure and DevOps teams are using Lambda very, very heavily. Before, there was just a lot they couldn't do, because there was no way to tap into what's happening in your AWS ecosystem. But now with Lambda, everything that's happening in your account gets captured by CloudTrail, which you can use, along with event patterns on EventBridge or CloudWatch Events, to trigger your functions to do all kinds of automated detection of changes you don't expect to happen, security checks, and things like that. Or even just basic things like automating processes and cleaning up resources that are no longer necessary. There are tons of things that a lot of DevOps teams are doing now that would have been really difficult to do in the past without Lambda, and I do see a lot of adoption in that particular space as well.

Jeremy: Awesome. All right. So I want to move a little bit past use cases, but I think maybe this ties into it. There are people who say, "Well, I can't use Lambda because it only runs for 15 minutes and I have ETL tasks that need to run longer," or "I have to have multiple jobs running together," or something. And this new thing that seems to maybe have been sparked by Google Cloud Run is this idea of serverless containers.
I spoke with Bret McGowen about this and the thinking behind it. And obviously we have Fargate on AWS. So what are your thoughts on this idea of expanding the term serverless to include things like Fargate and Cloud Run?

Yan: Well, listen, when I think about serverless, I don't think about specific technologies. I think in terms of the characteristics a technology has: the pay-per-use pricing model, not having to worry about the underlying infrastructure and how it's going to scale, and all of that. I think right now, Fargate is serverless in that you don't have to worry about the underlying infrastructure, the EC2 instances that your containers run on, or the cluster, or how to auto-scale them. But I guess what is missing right now is the event programming model, and the fact that it is not pay-per...

Jeremy: Pay-per-use.

Yan: Pay-per-use, yeah. But that's that. I think you do get a lot of the benefits that we enjoy from serverless technologies with Fargate already, and it does eliminate a lot of the limitations that Lambda has. Also, we should not see Lambda as a silver bullet; nothing is ever going to be a silver bullet. So the fact that you've got something else that allows you to run containerized workloads very easily and minimizes the amount of work you have to do... because, remember, the whole thing about serverless is removing undifferentiated heavy lifting, and a lot of that is around managing EC2 instances, configuring auto-scaling groups and clusters, and all of that. The fact that you can move a lot of that off your plate onto AWS with Fargate, I think that is a really good direction. I'm not a purist in terms of the terminology; all I care about is what I can get from a technology. And from that particular standpoint, Fargate is quite close to what we get with other similar services.
It just would be nice if you could trigger Fargate with event triggers directly.

Jeremy: That's the big thing too. I think Tim Wagner has said this as well: Lambda and Fargate are becoming closer and closer. For all intents and purposes, there's no reason why Lambda can only run for 15 minutes other than it's a limit that AWS set. They could let it run for an hour or 10 days if they wanted to. And if they wanted to, they could add some sort of event-driven approach to Fargate. You can start Fargate tasks now in a number of different ways, so there's a little bit of event-driven there, just not as clean as the Lambda stuff. As Lambda gets more of these serverful-type features and Fargate gets more of these serverless features, is there maybe a point where they become the same thing?

Yan: Probably, and hopefully. I think at that point it'd be really confusing for people, but I think that is ultimately where I hope we will get to: where a lot of the limitations that we currently have with Lambda are eliminated, and a lot of the benefits that we enjoy from Lambda but that are not available for Fargate become available for Fargate. So it becomes more of a choice in terms of, "Okay, what do I prefer working with? Do I have specific use cases that fit better with a containerized environment where I have more control of the infrastructure itself?" Then I use Fargate versus using Lambda. But in both cases, I can enjoy pretty much the same benefits. I think that would be a really good place to be.

Jeremy: Awesome. All right. So one of the things that I've been talking about a lot since the end of last year, and something I've been thinking about for a while, is this thing I call abstraction as a service. It's probably an annoying term, but what I'm thinking of is: Lambda functions themselves are pretty easy. You created a Hello World one, fine, simple.
You want to add an API Gateway, you use the Serverless Framework or SAM, and it makes it very easy to get these simple examples up and running. But start adding in SQS queues, or EventBridge, or Kinesis streams, and then understanding the sharding of Kinesis streams and how many concurrent Lambdas you might need, and then the new ability to replicate the stream... there's just a whole bunch of things happening there. And now suddenly your simple serverless app, where you upload a piece of code, is completely dwarfed by the amount of configuration you have to write and the understanding of all these different best practices. My premise here, or what I'm hoping to see, and I think this is something that Serverless Components are starting to do, and to a degree the Serverless Application Repository is starting to do, is to encapsulate these use cases or these outcomes and put them into something that's much more easily deployed, where you don't have to think about the 50 different components you might need to configure under the hood. You just say, "I want to build an API or a webhook that does this and that," and it's much easier to configure with sane defaults. So, we could talk about Serverless Components, but really what I want to do is focus on the Serverless Application Repository, because you've done a bunch of apps; I think you've got 10 of them in there now. What are your thoughts on SAR?

Yan: I think SAR is a good idea, but the execution is still problematic right now, at least from my experience working with SAR both as a consumer and as a publisher. One of the things that often stands out is that with SAR, the integration path is not super clear. For example, as a consumer, using a SAR app in my CloudFormation template is not just a normal CloudFormation resource.
It's not a native CloudFormation resource type, so you have to bring in the SAM macro even if you're not using SAM. A few times when I had to do that with the Serverless Framework, it was fine, I can bring in the SAM macro, but it becomes a bit weird. Also, AWS often talks about this idea that we should be doing least privilege as a default, but then they want you to just use their prepackaged policy templates for your SAR applications, which means that your application either has not enough permissions or has too many. It's really hard to size and tailor your permissions to follow least privilege. And when you do the right thing, the discovery in the console punishes you, because someone has to tick a box to find applications that are using custom IAM roles, the ones trying to do the right thing by giving you least privilege. I also find the discoverability itself not that intuitive. When you try to search for something, it gives you way more results than you're actually looking for. If you look at some of the top applications in SAR right now, they're all Hello World or introductory Alexa Skills examples. There is a lot of example code you can deploy to your account to look at how someone else does Alexa Skills, as opposed to something that is actually, truly useful. What that tells me is that AWS customers just don't really know what they can do with SAR.

Jeremy: Do you think it's a lack of incentive for people to publish those apps?

Yan: Part of it is that. Forrest Brazeal wrote a blog post a while back where he argued that SAR, being a marketplace of sorts, should be incentivizing companies and publishers to put out something that is not just a toy or example codebase, but something that, as a company, as an enterprise, I can actually have confidence deploying into my real production environment, knowing that it's been looked after.
When there are issues, someone will actually be there to fix it and patch it rather than leaving me in a ditch, because all I need is one experience like that to never want to touch anything in SAR ever again. So having some kind of scheme where publishers can be financially rewarded for the resources that are provisioned into my account, so AWS bills me for those resources and some of that revenue can be passed on to the publisher of the SAR application. That way you would hopefully encourage more commercial companies to start publishing things that are commercially looked after and adhere to SLAs and guarantees, which large enterprise customers will be willing and comfortable to actually deploy into their environments. The same way that when I look at the AWS Marketplace and buy some software that deploys to an EC2 instance, at least I have confidence that this is a commercial product, not just someone's toy project that they may not look after once they find something more interesting to work on. There's a bit of an image problem for SAR in terms of what it represents to consumers, and if we want people to have faith in it, then we really need to do something about that. I think commercialization is one step towards that.

Jeremy: I wonder about that too, because I read Forrest Brazeal's post as well, and I thought it made a lot of sense. You have other open marketplaces, other open ecosystems. Just think about NPM, for example. People use NPM packages all the time with probably no consideration of how well some of them are maintained. So you already have people using those and running into those kinds of problems. Maybe because SAR is so specific to AWS, it just doesn't seem as open source as something like NPM does, in a sense. But I totally agree. I just don't know: do people pay for some of these apps, or are they paying more for the support of them?

Yan: I think that's an interesting point about NPM.
But what I will say is that the impact a badly written SAR app can have on your organization is probably far greater than an npm package. Because now you're talking about resources that are provisioned into your AWS environment, where a malicious actor, for example, might be able to gain access to way more things than someone who's published a malicious npm package, though of course those can do a lot of damage too, and we fear those dependencies as well. Also, a bad [inaudible 00:29:43] application can just cost you a lot of money. Imagine someone deployed something with a VPC with a NAT gateway, and it starts charging you 4 cents per gigabyte of data transferred, and then those...

Jeremy: Get expensive quickly.

Yan: ... can get very expensive really quickly. I think in terms of the impact, it can be much greater. I will think twice about deploying something from SAR, whereas with npm it's often just okay.

Jeremy: Maybe it's one of those things too, because I think you're right. You're deploying something that is actually going to cost you money directly. So you have some of the costs of auditing and some of those things you might do with npm packages, but certainly with this, you're deploying things into an account that could rack up serious bills. That might be one of those other areas where SAR needs to help people understand exactly what types of resources they're provisioning, and maybe cost estimates and things like that could potentially help ease someone's mind. But I do agree. There need to be more people flooding that marketplace with good tools that they can use, and without having some sort of backing, I think that's kind of tough to achieve.

So speaking of other tools and other things that are available, the ecosystem that we have now for serverless frameworks, and not the Serverless Framework, but frameworks for serverless, I should say. The Serverless Framework being one of those, obviously, plus SAM, Architect, Claudia.js. There are a lot of them now.
There's ones for PHP, there's ones for Ruby on Rails, there's all kinds of these frameworks popping up. Pretty much every single one of them is doing the exact same thing.

It's taking some level of abstraction and compiling it down to CloudFormation, or making a bunch of API calls to AWS. What are your thoughts? I know you're a big proponent of the Serverless Framework. You've done a ton of Serverless Framework plugins, and I know that you've done a lot of work with SAM as well. So what are your thoughts on the overall ecosystem? What should people be using?

Yan: Personally I prefer the Serverless Framework, and I'm happy to go into details on why I think the Serverless Framework does well compared to a lot of the other frameworks. I think the Serverless Framework's biggest strength is the fact that it's got a great ecosystem of plugins that have support from the community. Pretty much anything that you run into, there's probably a plugin that can solve that problem for you, or at least make your life a lot easier. Even when that's not the case, it's really easy for you to write a plugin yourself. I guess I'll complain about their documentation on how to write a plugin; I think the only two articles they have there are still from Anna from, I think, three years ago.

But once you learn what a plugin looks like, it's fairly straightforward, because you can do so many different things. You can make API calls as part of the deployment cycle. You can transform the CloudFormation template. With SAM, it does a lot of things right out of the box, but the problem I have with SAM is that when you don't agree with the decisions that SAM has taken, it's really hard for you to do anything about it. One time I was working with a client and we were using SAM, and that's when SAM had just introduced IAM authentication for API Gateway. But they were also changing how the IAM permission was passed through.
So as a caller, I need to have the permission to invoke the function as well as the endpoint, which of course didn't make sense; it breaks abstractions and all of that, but there's no way for me to get out of it.

The only way I found was to actually write a CloudFormation macro, deploy that, and then change the template that SAM generates just to fix that one tiny little thing. This is where having that flexibility gives you good defaults like everybody else, but at the same time gives you a nice way out. I guess when it comes to frameworks, there's also the new CDK and [inaudible 00:33:59], which is a whole different paradigm that lets you program with your favorite programming language. I have to say I'm not a fan of this approach. I can see the temptation: "Oh, I like writing stuff in C#, I like writing stuff in JavaScript, and now I can use my favorite language to do everything."

But your preference for the language you want to write in, I don't think that should be very high on the list of criteria for choosing a good deployment framework. Things like the fact that the same infrastructure can now be expressed in ways you have to reason about differently, I think that is quite a dangerous thing. You can end up with arbitrarily complex things that would have been a lot simpler if everyone just wrote some JSON or YAML. That said, I do wish there were better tools for YAML. I see so many people struggle with just basic indentation problems. It happens all the time.

For me, I came from F# and Python as well, [inaudible 00:35:07] methods. I kind of learned that, but most people haven't. You have to be trained to look out for these kinds of problems, and we do need better tooling support for YAML. That said, I still think YAML or something like it is a better way to define your resource graph compared to writing code to do that.
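The CloudFormation macro workaround Yan describes can be sketched roughly like this: a Lambda function that receives the generated template fragment and returns a patched copy. The resource type matched and the property removed below are hypothetical, just to illustrate the shape of a macro:

```python
# Minimal sketch of a CloudFormation macro: CloudFormation invokes the
# Lambda with the template under "fragment"; the handler returns a
# transformed fragment. The specific tweak here (stripping an unwanted
# auth property) is a made-up example of "fixing one tiny thing" in a
# template that a framework generated for you.

def fix_template(fragment):
    """Apply one targeted fix to the generated template fragment."""
    resources = fragment.get("Resources", {})
    for resource in resources.values():
        # Hypothetical: remove an auth setting the framework insists on
        if resource.get("Type") == "AWS::ApiGateway::Method":
            resource.get("Properties", {}).pop("AuthorizationType", None)
    return fragment

def handler(event, context):
    # Macros must echo the requestId and report success/failure
    return {
        "requestId": event["requestId"],
        "status": "success",
        "fragment": fix_template(event["fragment"]),
    }
```

The macro is then referenced from the stack's `Transform` section, so it runs after SAM's own transform has expanded the template.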
I remember before all these frameworks, I was writing bash scripts to provision everything, and now I'm just substituting bash with C# or a prettier-looking language. And I don't think that's the right approach.

Jeremy: I actually agree with you on the CDK stuff. I know some people are huge fans of it, and they like the idea that you can build these constructs and then build higher-level constructs that wrap a bunch of things together. And it is kind of cool that you can encapsulate some of that stuff, but I do feel like there is a black box issue there. Maybe, as with Winston Churchill, who said that democracy is the worst form of government except for every other form of government, or something like that, I would say YAML is the worst form of configuration language except for every other configuration language.

All right, so the Serverless Framework, they just came out with Serverless Framework Pro. And I know you've experimented a little bit with it, but what are your thoughts, now that they've added things like monitoring and CI/CD and some of that other stuff?

Yan: I think it's a nice tool for someone who's new to serverless and just wants to have something they can use. But certainly from the monitoring perspective, I don't think Serverless Pro holds up compared to other, more specialized solutions that offer monitoring and tracing, where you can tell they're built by people who have been in this space for a very long time and understand the problem space. What I find with the Serverless Pro offering is that it gives you a lot of the basics, but it doesn't do much more beyond what you get with CloudWatch.

So as someone who's got a lot of experience with AWS and has used CloudWatch for many, many years, I don't see a lot of added value for me to invest in Serverless Pro.
But at the same time, if I'm new to AWS and new to serverless, having something that comes out of the box with the tool you need to use for deployment, I can definitely understand the temptation there. A lot of the applications I've done get complicated quickly. You've got lots of functions, lots of event triggers, lots of events flying everywhere. And I'm really interested in the tracing side of things, and that's where I think a lot of the tools we have today are not quite there yet.

Everything seems to struggle with tracing EventBridge at the moment, and X-Ray, for example, doesn't trace through SQS properly and doesn't trace through Kinesis at all. We get all these fragments of our transaction, but you can see that this space has been evolving really, really quickly. You look at the work that Lumigo has done, Epsagon has done, and Thundra has done. Everyone has gone through a lot of improvement over the last 12 months at least. And I do see this space getting more mature, and more of the, I guess, traditional big monitoring companies getting into this space as well.

And also a shout out to Honeycomb. I think they also do a very good job with their product. It's quite a big mind shift for people who are not used to doing event-based monitoring, but once you have that, it's really powerful. Splunk has been there for a long time, but they kind of price everybody out.

Jeremy: Listen, I think you're right about the monitoring component of Serverless Pro. It is good. I played around with it, and it does tell you about your invocations and things like that, but really understanding the tracing and some of that deeper stuff is a little bit more advanced. But I will say, Serverless Framework has been great at developer tooling. That's one thing they've done really well. And I think the greatest feature of Serverless Pro, at least for me, is the new CI/CD deployment stuff that they've released.
They've got something similar to what Seed.run did with being able to use a monorepo.

It's very hard to have multiple repos when you are building serverless apps, especially if your services are relatively small. Sometimes a monorepo makes sense, and being able to just deploy changes from individual directories I think is a pretty interesting thing. But anyway. All right. What about your thoughts on this? Because this is another thing we hear all the time, and it kind of drives me nuts, when we hear the term multi-cloud. You actually mentioned it earlier, where you were hedging your bets to say how easy is it for me to move from AWS to some other provider, as if that's something we actually care about. Do you see using a framework like SAM as potentially locking you in even more to AWS, or do you think that's a pointless argument?

Yan: I think it's more of a pointless argument. Even the tools that do support multi-cloud have different syntax, in the same way that Terraform has got different syntax for different cloud providers but gives you a consistent tooling experience when you use it with different cloud providers. You still have to learn the different syntax. You have to learn the cloud itself: what resources are available in AWS versus what resources are available in GCP or your [inaudible 00:40:55]. The Serverless Framework does support multiple clouds, but at the same time I think it's not as valuable as people probably make it out to be. Because, again, how you work with different clouds means completely different syntax and different resource types.

It gives you a consistent tooling experience in terms of running sls deploy and so on, but it doesn't remove the friction you're going to struggle with when you want to go from one cloud to another. There have been so many different blog posts on this; I've written a few of them myself as well.
And I think this whole multi-cloud thing is an argument about, for example, when I buy an iPhone, I can take out insurance, but how much do I pay for the insurance versus the cost of the phone itself? If the insurance is going to cost way more than just getting a new phone, then it's not wise to do that. At the same time, you look at some of the vendor lock-in arguments. Well, firstly it's not lock-in; you can still move things, it's just that there's a cost to moving. It's coupling, so there's a cost of moving.

You either deal with that when the scenario comes up, or you try to do a lot of work upfront, essentially investing all the work that you would have to do later at a point when you don't even know what's going to happen in the future. And the worst thing is, you end up with a lot of complexity that you have to carry all the way, and everything becomes more difficult and slower. Your developers have to work so much harder to do everything, as opposed to just making a decision, going with it, and knowing in the back of your head that if we ever need to move, these are the things we need to think about and do. I think Nick from Serverless Inc actually wrote a really good post about the fact that moving compute is always easy. It's the data. Data is incentivized to stay where it is and accumulate as much as possible. So it doesn't matter how easy it is to move your APIs from one container to another in different clouds; you still have to deal with the data, because there's no exact replica of DynamoDB.

There's no exact replica of the High Replication Datastore. I think it's foolish to spend so much effort upfront to prevent something that is probably unlikely to happen. How much I spend on insurance should be proportional to the risk of my phone getting stolen or lost, and also to the cost of the phone itself.
The same argument applies here, where a lot of this strategy is just insurance against the stolen phone.

Jeremy: I think I need to start hooking my guests up to a blood pressure monitor when I ask them the question about lock-in. Actually, it's very funny. I think people now, and I'm the same way when somebody asks me about this... I maybe do it just to get a rise out of people, but people get angry now trying to argue against this vendor lock-in thing. Because I think you're absolutely right. And the biggest concern I have where people play this vendor lock-in argument: you're locked into everything. You're locked into your iPhone, you're locked into Microsoft Word or whatever, if that's what you choose to devote your time to.

Yan: Anything you use.

Jeremy: You're locked into these things. I look at it and I say, if people are using that as an argument to use tools or pick technology that's the lowest common denominator, then they're not choosing the best tool for the job. I think that significantly impacts the ability of people to adopt serverless, because they say, "Well, if I write a Lambda function, I can't just easily move that to Azure. I can't easily move that to GCP. I need to work with all of those constraints." But honestly, I think if anything, you're just adding more work for yourself. And you're right, you're insuring yourself against something that is very unlikely to happen. And in the off chance that it does happen, I still think you're going to go through a massive exercise to migrate something, no matter how low the denominator was that you chose.

Yan: We went through all of this with ORMs. There were a few years where there was a new ORM every single month.

Jeremy: I'm going on the record. I hate ORMs. I hate them.

Yan: Because when we do have to move to a different database, it turns out the ORM doesn't really help me.
It's just another thing you've got to deal with as part of the migration process.

Jeremy: Something new to learn.

Yan: And it also gets in your way from the start, in terms of the complexity of getting started, but also when you want to do anything, you have to understand what happens under the hood and then also how to do it with the ORM.

Jeremy: Exactly.

Yan: It's crazy.

Jeremy: And the optimization isn't there. You don't get the optimizations with an ORM. The biggest thing that drives me nuts about them is that you write a query, and then you have some ORM or whatever that has to run three separate queries in order to merge the data back in the application layer, because that's how it was built, where you could have just written a native query and done a join or something like that, and it would have been a thousand times more efficient. But anyway, yeah. You mentioned Terraform as well when you were talking. What are your thoughts on Terraform for serverless deployments?

Yan: It's very laborious and painful. I remember at one of my previous jobs, I convinced the teams to use the Serverless Framework, and all I had to do was show them a very simple API Gateway endpoint with a Lambda function. It was three lines of code in the Serverless Framework; it was about 150 lines of Terraform script. And you could see the teams that were using the Serverless Framework just go in there and get it done, get a feature shipped and tested and all that, while other teams would spend the next two weeks just writing Terraform scripts. I had engineers coming up to me describing their job as: we spend about 60% of our time just writing Terraform. When you're talking about serverless being about not doing undifferentiated heavy lifting, something is not quite right when you spend most of your time just writing infrastructure.

Jeremy: That's the thing too with Terraform.
Terraform is a very good product and there's all [crosstalk 00:47:28]. Terraform, the enterprise edition, has a lot of great things like safeguards and some of those other features. I think it is a very good tool for cloud management, but at the same time, I think you're right, not very productive for the serverless developer.

Yan: No. If I'm provisioning VPCs and networking and things like that, I'm very happy to use Terraform. It is a very good tool for that. But when I just want to write a few Lambda functions, hook up a few endpoints, and have some event sources like SNS, SQS and so on, I really don't need Terraform. What I need is something that gives me good defaults, lets me do what I need to do, and gets out of the way so I can move on to the next thing, rather than getting bogged down in the detail of the specifics. That's just not productive. That's not useful. That's just undifferentiated heavy lifting.

Jeremy: You are preaching to the choir. All right, let's move on to: where is this going? Serverless in general. This is one of those things where I think you and I would agree, and I think you mentioned it earlier, that we're making it more complicated. We're adding new features, the learning curve keeps getting steeper and steeper. There are still some use cases that are not necessarily perfect for it. AWS is making advancements in some of those things: reducing VPC cold starts, adding things like RDS Proxy and provisioned concurrency and those sorts of things. But are there other things that are holding serverless back? Does there need to be some other breakthrough before it goes mainstream?

Yan: I don't know about a major breakthrough, but I definitely think more education and more guidance, not just in terms of what these features do, but also when to use them and how to choose between different event triggers. That's a question I get all the time: "How do I decide when to use API Gateway versus ALB?
How do I choose between SNS, SQS, Kinesis, DynamoDB Streams, EventBridge, IoT Core?" That's just six application integration services off the top of my head. There's just no guidance around any of that stuff, and it's really difficult for someone new coming into this space to understand all the ins and outs and trade-offs between SNS and SQS and Kinesis and so on.

Having more education around that, having more official guidance from AWS around that, would be really useful. In terms of technology, I like the trajectory that AWS has been on. No flashy new things, but rather continuously solving those day-to-day annoyances, the day-to-day problems that people run into. The whole cold start thing, again, is often overplayed and often underplayed; it's never as good as some people say, and it's never as bad as some other people say. But they are shipping solutions for people with real problems, where cold starts bite for various different reasons.

I really like what they've done with provisioned concurrency, even if I think the implementation is still, I guess, a version one. So hopefully some of the kinks they currently have will be solved. Other than that, I'd like to see them do more on the multi-account management side of things. Control Tower is great, but again, there's a lot of clicking stuff in the console to get anything set up, and it's also very easy to rack up a pretty big bill if you're not careful, because you can provision a lot.

NAT gateways, for example, and things like that. One of the companies I've been talking to recently, a Dutch bank, is actually building some really cool tooling themselves, essentially giving you infrastructure as code for your whole organization. Think of it as a CloudFormation extension that allows you to capture your entire org.
Imagine I have a resource type that defines my org and the different accounts, and then they configure the CloudTrail setup for multi-account, configure GuardDuty and things like that, all within one template, which looks just like CloudFormation. So some really amazing tooling that those guys have built.

But having something like that from AWS would be pretty amazing as well. Because, again, we've seen more and more people getting to the point where they have a very complex ecosystem of lots of different enterprise accounts, and managing them and setting up the right things, the SCPs and so on, is not easy. We certainly don't want people to be constantly going into the console and clicking things. And that's another annoyance I constantly have with AWS documentation: they keep talking about infrastructure as code, but every single piece of documentation just tells you, go to this console, click this button.

Jeremy: That's how you do it in the console. Exactly.

Yan: What the hell?

Jeremy: Yeah, exactly. I guess one of the things I try to tell people who ask me how to get into the cloud or start building stuff in serverless is to do a slow migration pattern. You can't just jump all in; you can't rewrite everything in serverless at once. Often, though, that does require rewriting applications. Do you see a potential path for making it easier to move those applications into Lambda, or into Fargate maybe? If there were an easier path to lift and shift, would that be something you think would make sense?

Yan: I think that would make sense. I guess I'll have to wait and see what kind of execution comes from that. Because, again, you're making a lot of assumptions about what people are using and what they're doing, to be able to do that well. And of course, if you do that, it's really easy to do it badly. I think it would be great, but it really depends on the execution.

Jeremy: Awesome. All right, so any other missing pieces in serverless?
I think you and I agree we need some sort of serverless Elasticsearch.

Yan: Absolutely.

Jeremy: But anything else you can think of that's maybe missing?

Yan: Let's see. Nothing off the top of my head. But definitely some kind of serverless Elasticsearch, that would be awesome.

Jeremy: Awesome. All right. So, final question here. Because now that I have you, and I think that with everything you write, the courses you do, the ton of in-person workshops and all of your talks, everything you do is very, very good advice. And I think you've been a Serverless Hero for quite some time. So maybe we can capture: if people are interested in moving to serverless, what is your one sentence, or it can be a little longer. How would you suggest people make that first step into serverless?

Yan: Subscribe to this newsletter that I hear is pretty good, something called Off-by-none. It's a really good way to get a regular newsletter about all kinds of different content.

Jeremy: I did not pay you to say that. I just want to make sure that's clear.

Yan: But yeah, definitely. I think one of the dangers of Lambda being deceptively simple is that there are still a lot of things you have to learn, still a lot of things you have to understand. You can make really bad mistakes. We keep reading horror stories on the web, but a lot of that is because of a lack of research, and I think Joe Emison said it really well: if you spend two weeks researching and two days doing work, you're probably going to end up better off than if you do two days of research and two weeks of work.

Jeremy: Yes, I totally agree.

Yan: So that you don't make all these mistakes. But in terms of actual advice, reach out to people in the community. People like you, me, Ben Kehoe, and others, we are all very happy to help you do some research, and if you're stuck, just ask us questions.
We're all very keen to see a world where human productivity is not wasted on setting up servers and managing them. We'd be very happy to help you, so we'll help you get started the right way.

Jeremy: Awesome. All right. Well, thank you again so much for joining me and sharing all of this serverless knowledge with everyone in the community, and obviously for the things you continuously do to help people learn and to educate people on serverless. If people want to find out more about you, how would they do that?

Yan: They can go to theburningmonk.com or follow me on Twitter as @theburningmonk.

Jeremy: And you've got a bunch of courses and open-source projects that you work on. Those are all available on theburningmonk.com.

Yan: Yeah, yeah. A bunch of courses you can find under the courses heading. There's also a bunch of in-person workshops I'm doing this year, and also just lots and lots of blog posts.

Jeremy: Awesome. All right, well, we will get all that into the show notes. Thanks again.

Yan: Thank you. Thanks for having me.


27 Jan 2020


#4 - Yan Cui Serverless Hero

Talking Serverless

In episode #4 of the Talking Serverless podcast, Ryan Jones talks to Yan Cui, an AWS Serverless Hero and the author of Production-Ready Serverless, about re:Invent 2019 and other trends developing in the serverless ecosystem. You can find more from Yan Cui on his website. Twitter--- Send in a voice message: https://anchor.fm/talking-serverless/message


1 Jan 2020


Yan Cui talks Serverless | Enginears Podcast


In this episode, we’re joined by AWS Serverless Hero, Yan Cui.  If you're keen to share your story, please reach out to us!  Prefer to watch your podcasts? Check us out on YouTube: https://bit.ly/3dCMPRb You can also find us on Twitter at: http://bit.ly/36cTJYr Host: Elliot Kipling: https://bit.ly/3i1KIYM Editor: Kane Hunter: https://bit.ly/2Vij5kD 


4 Dec 2019


The Business Value of Serverless with Yan Cui

Real World DevOps

About the Guest

Yan is an experienced engineer who has run production workloads at scale in AWS for nearly 10 years. He has been an architect and principal engineer in a variety of industries ranging from banking, e-commerce and sports streaming to mobile gaming. He has worked extensively with AWS Lambda in production, and has been helping various UK clients adopt AWS and serverless as an independent consultant. He is an AWS Serverless Hero and a regular speaker at user groups and conferences internationally, and he is also the author of Production-Ready Serverless.

Guest Links: Yan's blog, Yan's video course: Production-Ready Serverless, Find Yan on Twitter (@theburningmonk), Subscribe to Yan's newsletter, Centralised logging for AWS Lambda

Transcript

Mike Julian: Running infrastructure at scale is hard. It's messy, it's complicated, and it has a tendency to go sideways in the middle of the night. Rather than talk about the idealized versions of things, we're going to talk about the rough edges. We're going to talk about what it's really like running infrastructure at scale. Welcome to the Real World DevOps podcast. I'm your host, Mike Julian, editor and analyst for Monitoring Weekly and author of O'Reilly's Practical Monitoring.

Mike Julian: This episode is sponsored by the lovely folks at InfluxData. If you're listening to this podcast, you're probably also interested in better monitoring tools, and that's where Influx comes in. Personally, I'm a huge fan of their products, and I often recommend them to my own clients. You're probably familiar with their time series database, InfluxDB, but you may not be as familiar with their other tools: Telegraf for metrics collection from systems, Chronograf for visualization, and Kapacitor for real-time stream processing. All of this is available as open source, and they also have a hosted commercial version too.
You can check all of this out at influxdata.com.

Mike Julian: Hi folks, I'm here with Yan Cui, an independent consultant who helps companies adopt serverless technologies. Welcome to the show, Yan.

Yan Cui: Hi Mike, it's good to be here.

Mike Julian: So tell me, what do you do? You're an independent consultant helping companies with serverless. What does that mean?

Yan Cui: I actually started using serverless quite a few years back, pretty much as soon as AWS announced it. I started playing around with it, and over the last couple of years I've done quite a lot of work building serverless applications in production. I've also been really active in writing about the things I've learned along the way. As part of that, a lot of people have been asking me questions, because they saw my blog posts and talks about problems they've been struggling with, and asked me, "Hey, can you come help me with this? I've got some questions." I like to help people, first of all, and that's something that's been happening more and more often, so in the last couple of months I have started to work as an independent consultant, helping companies who are looking at adopting serverless, or maybe moving to serverless for new projects, and want some guidance on the things they should be thinking about, and maybe some architectural reviews on a regular basis. For things like that, I've been helping a number of companies, both with workshops but also regular architectural reviews. At the same time, I also work part-time at a company called DAZN, which is a sports streaming platform, and we also use serverless and containers very heavily there as well.

Mike Julian: Okay, so why don't we back up several steps. What the hell is serverless? Just to make sure that we're all talking about the same thing.
What are we talking about?

Yan Cui: Yeah, that's a good question, and I guess a lot of people have been asking the same question as well, because now, you see, pretty much everyone is throwing the serverless label at their products and services. Just going by the popular definition out there, based on what I see in the talks and blog posts in my social media circle, serverless is pretty much any technology where, one, you don't pay for it when you are not using it, because paying for uptime is a very serverful way of thinking and planning. Two, you don't have to worry about managing and patching servers, because installing daemons or agents or any form of auxiliary or support software is, again, definitely tied to having servers that you have to manage. And three, you don't have to worry about scaling and provisioning, because the system just scales the number of underlying servers on demand. And by this definition, I think a lot of the traditional backend services out there, like AWS S3 or Google BigQuery, also qualify as serverless.

Mike Julian: Okay, so Lambda is a good example of serverless, but there's also this thing of functions as a service, and they seem to be used interchangeably sometimes. What's going on there?

Yan Cui: To me, functions as a service describes a change in how we structure our applications, changing the unit of deployment and scaling to the individual functions that make up an application. A lot of the functions-as-a-service solutions, like Azure Functions or Lambda as you mentioned, would also qualify as serverless based on the definition we just talked about, and generally I find that there is a lot of overlap between the two concepts or paradigms, between functions as a service and serverless.
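The change in the unit of deployment that Yan describes is easiest to see in code: each function is a small, self-contained handler that the platform invokes per event and scales independently. A minimal sketch, using a simplified API Gateway-style event payload purely for illustration:

```python
import json

def handler(event, context):
    # Each function like this is deployed and scaled on its own; the
    # platform invokes it per event and you pay only per invocation,
    # matching the pay-for-use definition of serverless above.
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello {name}"}),
    }
```

An application then becomes a collection of these handlers wired to event sources, rather than one long-running server process.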
But I think there are some important subtleties in how they differ, because you also have functions-as-a-service solutions like Kubeless or Knative that give you the function-oriented programming model and the reactive, event-driven model for building applications, but then run on your own Kubernetes cluster.

Yan Cui: And if you have to manage and run your own Kubernetes cluster, then you do have to worry about scaling, and you do have to worry about patching servers, and you do have to worry about paying for uptime on those servers, even when no one is running stuff on them. The line gets blurred when you consider Kubernetes-as-a-service offerings like Amazon's EKS or Google's GKE, or Amazon's Fargate, which lets you run containers on Amazon's fleet of machines so you don't have to worry about provisioning, managing, and scaling servers yourself.

Yan Cui: At the end of the day, I think being serverless, or having the right labels associated with your product, is not important. It's all about delivering on business needs quickly. But having a well-understood definition for these different ideas really helps us understand the implicit assumptions we make when we talk about something. Everyone calling their services or products serverless is really not helping anyone, because if everything is serverless, then nothing is serverless, and I really can't tell what sort of assumptions I can make when I think about your product.

Mike Julian: Right, this is the problem with buzzwords: the more you have of them, the less they actually mean and the more confused I am about what you do. So, because I love talking about where things fall apart... Like serverless, it's a cool idea. I think it works really well, and yet I've seen so many companies get so enamored with it that they spend six months trying to build their application on serverless or in that model.
And then a month later, they go under. I can't help but draw the tie between the two: you spent all your time trying to innovate on this, and at the end of the day you didn't have any time to innovate on the product. So that's an interesting failure mode. But I'm sure there are others, where people are adopting serverless the same way we first started adopting containers. Like, "Hey, I just deployed a container, works on my machine, have fun." So when is serverless not a good idea? What are the pitfalls we're running into? What are people not thinking about?

Yan Cui: I think one of the problems we see all the time... You mentioned hype — a lot of adoption happens because there's a lot of hype behind a technology, and there's a lack of understanding of the requirements and the technical constraints you actually have, and you go straight into it. I think this happens all the time, and that's why we have the whole hype cycle. When you're a newcomer to a new paradigm, it's so easy to become infatuated by what a product can do, and when all you have is a hammer, you start looking for nails everywhere. This happened when we discovered NoSQL: all of a sudden, everything had to be done with NoSQL. MongoDB and Redis got used everywhere to solve every possible database problem, often with disastrous results, because again, people weren't thinking about the constraints and the business needs they actually had, and focused too much on the tech. If anything, I think with serverless we have this great opportunity to think more about how we deliver business value quickly, rather than about the technology itself. But as engineers, as technology people ourselves, you can see how easy it is to fall into that trap, and I think there are a couple of use cases where serverless is just not very good in general right now.
One of them is when you require consistent and very high performance.

Yan Cui: Quite a lot has been made of cold starts, which is something relatively new to a lot of people using serverless, but again, it's not entirely new. For a very long time we've had to deal with long garbage collection pauses, or a server being overloaded because load is not evenly distributed. But with serverless it becomes systematic, because every time a new container is spawned for one of your functions, you get a spike in latency. For some applications that's not acceptable, because maybe you're building a realtime game, for example, where latency has to be consistent and very, very fast. If you're talking about, say, a multiplayer game needing the 99th-percentile latency to be below 100 milliseconds, that's just not something you can guarantee with Lambda or any serverless platform today.

Mike Julian: I worked with a company a while back that was building a realtime engine, and that was a hell of a problem. We were building everything on bare metal and VMware, with this really nice orchestration layer running on top of Puppet. And it was a hell of a problem, because as load comes up, we're automatically scaling the stuff out — except as we're adding the new nodes, latency is spiking, because we're trying to move traffic over to something that's not ready for it.

Yan Cui: Yes, and with serverless you don't have the luxury of letting the server warm up first and giving it some time before you actually put it into active use. A new container is literally spawned in response to the first request that there isn't a spare container around to handle. So you always have cold starts; you can't just say, "Okay, I'm gonna give this server five minutes to get warmed up first."
Maybe it's the JVM that takes up your warmup time, so you tell your load balancer and the rest of the system to take into account the time it needs to warm up before you put it into active service. With serverless you can't do that, so where you do need consistently high performance, serverless is a really bad fit right now. I think you just touched on something else there as well: the case where you need a persistent connection to a server, so there's some logical notion of a server.

Yan Cui: That's again something that serverless is not a good fit for. Say you want a persistent connection in order to do realtime push notifications to connected devices, or to implement subscription features in GraphQL, for example. In those cases you're also constrained by the fact that an invocation of a function can only run for a certain amount of time. I think that's a good constraint. It tells you there are certain use cases that are a really good fit for functions as a service, and a whole set of other cases where you just shouldn't even think about doing it. There are ways to work around it, but by the time you've done all of that, you really have to ask yourself, "Am I doing the right thing here?"

Mike Julian: Right.

Yan Cui: And I think another interesting case — and this is again something that I find is often blown out of proportion — is cost. Sure, Lambda is cheap because you don't pay for it when it's not running, but when you have even a medium amount of load, you might find that you pay more for API Gateway and Lambda compared to just running a web server yourself.
Now that's true, but one of the things most people don't think about enough is the personnel cost: the skill set you need to run your own cluster, to look after your Kubernetes cluster, to do all the other things associated with having servers. That often makes it more expensive than whatever premium you pay AWS to run your functions.

Yan Cui: However, if you're talking about a system that has, I don't know, maybe tens of thousands of requests per second, consistently, all the way through the day, then those premiums on individual invocations can start to stack up really, really quickly. I had a chat with some of the guys at Netflix a while back, and they mentioned a rough calculation they did: if everything at Netflix ran on Lambda today, it would cost them something like eight times more. If you're running at Netflix scale, that is a lot of money — way more than you'd pay to hire the best team in the world to look after your infrastructure. So if you're at that level of scale and the cost starts to rack up, then maybe it's time to think about moving your load onto a more traditional containerized or VM-based setup, where you can get a lot more out of your servers and do a lot more of the performance optimization there, rather than running it all in Lambda.

Yan Cui: And I think the final use case where Lambda, or serverless, is probably not that good a fit today is this: even though you get a good baseline of redundancy built in — you get multi-AZ out of the box, and you can build multi-region, active-active APIs relatively easily — because we're relying on the platform to do a lot more, and the platform services are essentially a black box to us, there are cases where the built-in redundancy might not be enough.
For example, if I'm processing events in real time with Kinesis and Lambda, the state of the poller is a black box; it's something I can't access. Say I want to build a multi-region setup whereby, if one region starts to fail, I can move the stream processing to a different region and turn it on there — an active-passive setup. Then I need access to the internal state of the poller, which is not something I can get to, or I have to build a whole lot of infrastructure around it to simulate it.

Yan Cui: And again, by the time I've invested all that effort, maybe I should have just started with something else to begin with. Those are some of the constraints I have to think about when I decide whether Lambda or serverless is a good fit for the problem I'm trying to solve. As much as I love serverless, again, I don't think it's about the technology. It's about finding ways to deliver on the business needs you have, so whatever you choose has to meet the business needs first and foremost — and then anything that lets you move faster, you should go with that.

Mike Julian: All this reminds me of an image that floated around Twitter a while back, that people dubbed the "Docker cliff." The idea was that you had Docker at the very bottom of dev and prod, but to get something from dev — like when I'm developing with Docker on my laptop — to actually running in production takes way more than just a container. How do you do the orchestration? How do you do the scheduling? How are you managing the network? What are you doing about deployment, monitoring, supervision, security, and all this other stuff on top of it that people weren't really thinking about? For developers, Docker was fantastic. Like, oh hey, everything is great, it's a really nice self-contained deployable thing — except it's not really that deployable.
And I'm kind of seeing that serverless is much the same way: we threw out a bunch of Lambda functions, like, this is great. And immediately the next question is, "How do I know they're working? How do I know when they're not working? What's going on with them?" CloudWatch Logs is absolutely awful, so trying to understand what's happening through there is just super painful, and the deployment model is kind of janky right now. How I've been deploying them is just a shell script wrapped around the aws-cli. I'm sure there are better ways to do it — so is there other stuff like this? Are there other things that we're not really thinking about, and what do we do about them?

Yan Cui: Yeah, absolutely. The funny thing is that a lot of the problems you mention are things I hear from clients or from people in the community all the time: how do I do deployment, how do I do the basic observability stuff? The thing is, there are solutions out there that help to varying degrees, and as is the case with a lot of AWS services, they cover the basic needs — CloudWatch Logs being a perfect example — but do it very crudely.

Mike Julian: Right, it's like an MVP of a logging system.

Yan Cui: Yes.

Mike Julian: Sorry, CloudWatch team, it's true.

Yan Cui: And the same goes, I guess, for CloudWatch itself. But the good thing is that at least you don't have to worry about installing agents and whatnot to ship your logs to CloudWatch Logs. So CloudWatch Logs becomes a good staging area to gather your logs, and from there you can ship them somewhere else — maybe an ELK stack, or one of the managed services like [inaudible 00:18:48], Loggly, or Splunk, or something else. The pattern for doing that is actually pretty straightforward. I've got two blog posts which I guess we can link to...

Mike Julian: Yeah, we'll throw those in the show notes.

Yan Cui: ...
In the show notes. One other thing which I think is quite important is security. Again, as developers we're just not used to thinking about security, and I see a lot of organizations try to tackle the security problem with this hammer called VPC, as if network security is gonna solve all of your problems. In fact, every single VPC I've seen in production, none of them do egress filtering. So if anyone is able to compromise your network security, they find themselves in this fully trusted environment where services talk to each other with no authentication, because you just assume it's trusted since you're inside the VPC now. But we've seen several times how easy it is to compromise the whole ecosystem by attacking the dependencies everyone has. I think it was last year when a researcher managed to compromise something like 14% of all NPM packages, which accounts for something like a quarter of the monthly downloads of NPM, including-

Mike Julian: Well, that's gonna make me sleep well.

Yan Cui: So imagine someone compromises one of your dependencies and puts in a few lines of code to scan your environment variables and send them to their own backend, to harvest all these different AWS credentials and see whether they can do some funky stuff with them. That is not something you can protect against by putting a VPC in front of things. And yet we see people take this huge hammer and apply it to serverless all the same, even though with Lambda you pay a massive price for using VPCs in terms of how much cold start you experience. My experience tells me that running a Lambda function inside a VPC can add as much as 10 seconds to your cold start time, which basically rules out any user-facing APIs.
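The cold-start effect described here is easy to observe for yourself. The sketch below is purely illustrative (not from any tool mentioned in the episode): because a Lambda container runs its module-level code exactly once, a module-level flag can tell the first, cold invocation apart from the warm ones that follow.

```python
# Sketch: observing cold starts from inside a Lambda handler (Python).
# Module-level code runs once per container, so a module-level flag tells
# the first (cold) invocation apart from the warm ones that follow.
import time

_cold_start = True              # True only until the first invocation completes
_container_started = time.time()

def handler(event, context):
    global _cold_start
    was_cold = _cold_start
    _cold_start = False         # every later invocation in this container is warm
    return {
        "cold_start": was_cold,
        "container_age_seconds": round(time.time() - _container_started, 3),
    }
```

Logging the `cold_start` flag alongside response latency makes penalties like the VPC one directly visible in your metrics.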
But with Lambda you can actually control your permissions down to the individual function level, and that's again something I see people struggle with, because we don't like to think about IAM permissions and stuff. It's difficult, it's laborious.

Mike Julian: Well, you know, I think the real problem is that no one knows how IAM actually works.

Yan Cui: To be fair, I'm probably a bad example because I've been using AWS for such a long time and I'm used to the mechanics of IAM and writing the permissions and policies, but yes, it is much more complicated than people-

Mike Julian: It is a little esoteric.

Yan Cui: Yes, definitely. And I have seen some tools coming onto the market — I think PureSec is one of them, and there are a few others — that look at how to automate this process: identify what your function needs by doing static analysis on your code, seeing how you interact with the AWS SDK — oh, your function talks to this table — and then, when you deploy or in your CI/CD pipeline, flag that, hey, your function doesn't have the right permissions, or it's overly permissive. Because again, a lot of people just use star and let the function access everything, which also means that if your function is compromised, the attacker can get your temporary credentials and do anything with them. So some of these tools are going to automate the pain we experience as developers in figuring out what permissions our function actually needs, and then automatically generate the templates we can put into our deployment framework. And you talked about the deployment experience being clunky right now. There are quite a lot of different deployment frameworks that take care of a lot of the plumbing and complexity under the hood.
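The kind of scoped-down, per-function policy such tools aim to generate — instead of a wildcard — might look something like this (the table name, account ID, and statement ID are invented for illustration):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "LeastPrivilegeForGetOrder",
      "Effect": "Allow",
      "Action": ["dynamodb:GetItem", "dynamodb:Query"],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/orders"
    }
  ]
}
```

Compare that with the `"Action": "*", "Resource": "*"` policies Yan describes: if the function above is compromised, the stolen credentials can only read one table, not touch the whole account.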
I don't know if you've ever tried to provision an API Gateway instance using CloudFormation or Terraform — it's horrendous.

Mike Julian: It's not exactly simple.

Yan Cui: It's so, so complicated because of the way resources are organized in API Gateway. But with something like the Serverless Framework, or AWS SAM, or a number of other frameworks out there, I can just write a human-readable endpoint in one line that translates to, I don't know, maybe a hundred lines of CloudFormation template code.

Mike Julian: That's awful.

Yan Cui: This is just not stuff that I want to deal with, so there are frameworks out there that ease a lot of the burdens with deployment and similar things. On the visibility side of things, there are also quite a lot of companies focusing on tackling that side of the equation, in terms of giving you better traceability. One of the things we find with serverless is that people are now building more and more event-driven architectures, because it's so easy to do nowadays.

Mike Julian: Right.

Yan Cui: And part of the problem with that is they're a lot harder to trace compared to direct API calls. With API calls, I can easily pass a correlation ID along in the headers, and then a lot of the existing tools, like AWS X-Ray, can just kick in and integrate with API Gateway and Lambda out of the box. But as soon as my event goes over asynchronous event sources like SNS, Kinesis, or SQS, I lose the trace entirely, because X-Ray doesn't support those asynchronous event sources. But there are companies like Epsagon who are now looking at that problem specifically, trying to understand how data flows through the entirety of the system, whether it's synchronous through APIs or asynchronous through the event streams or task queues or SNS topics that you have.
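One manual workaround for the lost trace is to carry the correlation ID yourself in the message attributes across each asynchronous hop. A minimal sketch, with invented helper names (this is the general pattern, not a specific library's API); note that SNS expects a `StringValue` key when publishing but delivers attributes to Lambda with a `Value` key:

```python
# Sketch: hand-carrying a correlation ID across an SNS hop, since tracing
# headers don't survive asynchronous event sources. Helper names are invented.
import uuid

CORRELATION_KEY = "x-correlation-id"

def inject(message_attributes, correlation_id=None):
    """Return message attributes for sns.publish() with the ID attached."""
    attrs = dict(message_attributes)
    attrs[CORRELATION_KEY] = {
        "DataType": "String",
        "StringValue": correlation_id or str(uuid.uuid4()),
    }
    return attrs

def extract(sns_record):
    """Recover the correlation ID from one SNS record in a Lambda event."""
    attrs = sns_record.get("Sns", {}).get("MessageAttributes", {})
    entry = attrs.get(CORRELATION_KEY, {})
    return entry.get("Value")  # delivered records use "Value", not "StringValue"
```

The publishing function calls `inject` before `sns.publish(...)`, and the consuming function calls `extract` on each record, so the same ID appears in the logs on both sides of the hop.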
And there are also companies focusing on the cost side of things, understanding the cost of user transactions that span this massive web of different functions, loosely coupled together through different event sources — CloudZero being one of those, and I guess one of the foremost companies focusing on the cost story of serverless architectures. So there are quite a lot of interesting startups focusing on various aspects of the problems we've described so far, and I think over the next six to twelve months we're gonna see more and more innovation in this space, even beyond all the things Amazon's already doing under the hood.

Mike Julian: Yeah, that sounds like it'll be awesome. This whole area still feels pretty immature to me. I know there are people using it in production. There were also people using Mongo in production while it was dropping data like crazy every day — so more power to them if they don't like data. But I like stable things. So it sounds like serverless is still maturing. It is ready, but we're still kind of working some of the kinks out? Would that be a fair characterization?

Yan Cui: I think that's a fair characterization of the tooling space, because a lot of things are provided by the platform, and as I mentioned before, Amazon is good at meeting the basic needs you have. So you can probably get by with a lot of the tools you get out of the box, but I guess that also slows down some of the commercial tooling support, compared with what something like containers gets with Kubernetes. Then again, because you only get so much out of the box, it's a huge opportunity for vendors to jump in very, very quickly. At the same time, I think those innovations are happening a lot faster than people realize.
Maybe one of the problems is just education: getting the information about all the different tools coming into the space out there, and making people aware of them.

Mike Julian: That's really interesting, and what I think a lot of people forget is exactly how old Docker is, because Docker was kind of in the same position as serverless: it was really cool, but it was still pretty immature. And now we're seeing Kubernetes maturing that ecosystem further. It is actually in production; we know the patterns, we know how all that stuff gets deployed, we know how to manage it, we know the security. It is pretty mature, but how long did it actually take to get there? Docker's initial release was in 2013 — that's five years ago, which has blown my mind — and Kubernetes' initial release was in 2014, four years ago. But it's only really been in the past year or two that Kubernetes has become what we'd call mature. And now we're starting to see this massive uptick of abstraction layers on top of Docker in the form of Kubernetes. At some point, I think we're gonna see that with serverless, where it's not just, oh, we're deploying this Lambda function and calling it a day. I think we're gonna see a lot more tooling, a lot more abstraction that brings it all together and makes it so much easier to deal with, especially at scale.

Yan Cui: Yeah, I absolutely agree, and just in terms of the dates you mentioned: the initial announcement of Lambda was in 2014, so in terms of age it's not that much younger than Docker and Kubernetes. Where it differs is that it's a brand new paradigm, whereas with containers and with Kubernetes, it's a lot easier to lift and shift existing workloads without having to massively restructure your application to be idiomatic for the paradigm.
With Lambda, and with serverless, there is that requirement: to be idiomatic, there's a lot of restructuring and rethinking you need to do, because it's a mindset change. And that takes a lot longer than just a technology change.

Mike Julian: Right, yeah. We're talking about something completely new here. So it's not like, oh, we'll just implement Lambda overnight and call it a day, we'll just move our whole application over. It's not like when we started putting things in containers. We could actually put a thing in a container, but really all we were doing by lifting and shifting was moving from one server to another, except now it's a smaller server.

Yan Cui: Yes.

Mike Julian: We had the idea of the fat container, where you had absolutely everything in one container. That's a bad idea, it's a dumb pattern. And it's going the same way with serverless, I think. You can't just lift and shift. It is a brand new architectural pattern. It requires a lot of serious thought.

Yan Cui: Yeah, and I think one of the pitfalls I see in serverless adoption sometimes is that we get so swept up in this whole movement to a new paradigm that we forsake all the things we've learned in the past, even though a lot of the principles still very much apply. In fact, a lot of what I've been writing about is basically how we take previous principles and apply them, adjust them, and make them work in this new paradigm. The practices and patterns may have to change, because some things just don't work anymore, but the principles still very much apply. Why do we do structured logging? Why do we do sampling in production? The principles behind those still very much apply when it comes to serverless. It's just that how we get there is different.
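As a concrete illustration of one of those carried-over principles, structured logging with sampled debug logs might look something like the sketch below inside a function (names and the sampling approach are invented for the example; this is not any particular library's implementation):

```python
# Sketch: structured (JSON) logging with per-invocation debug sampling.
# All names here are illustrative, not from any specific logging library.
import json
import random
import sys
import time

def make_logger(correlation_id, sample_rate=0.01):
    """Create a logger that emits one JSON object per line; DEBUG logs are
    kept for roughly `sample_rate` of invocations instead of all of them."""
    debug_enabled = random.random() < sample_rate

    def log(level, message, **fields):
        if level == "DEBUG" and not debug_enabled:
            return None  # dropped by sampling
        record = {
            "level": level,
            "message": message,
            "correlationId": correlation_id,
            "timestamp": time.time(),
            **fields,
        }
        sys.stdout.write(json.dumps(record) + "\n")
        return record

    return log
```

Calling `make_logger(correlation_id)` once per invocation means an entire invocation's debug logs are either all kept or all dropped, which keeps sampled traces coherent while cutting log volume.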
And I think that is one of the key things I've had to learn over the last couple of years. Take databases: a lot of the things we learned about databases are still very much here to stay, even if we don't need the specific skill set that DBAs provided in the new world of NoSQL databases. When it comes to serverless, making the leap from looking at practices to understanding the principles behind them — why do we do this, how can we apply those principles — is super important to making a successful adoption of serverless in your organization.

Mike Julian: That's an absolutely fascinating perspective, because I completely agree. What I absolutely love about it is that the principles of site reliability haven't actually changed. The principles of how we run and manage systems — have they really changed a whole lot in the past 10 years? Which is fantastic. That's how it should be. We should always be looking for true principles — the pillars of how we behave and how we look at what we work on. How we do it changes all the time, and it absolutely should, but the principles shouldn't change that much. So that's the interesting part: trying to apply the principles we already know to be true, the practices that we know work, to a new paradigm. And sure, maybe some of them aren't going to apply very well, and maybe we'll have to create new ones, which I'm sure will come out of this. But we don't have to start from scratch.

Yan Cui: No. What's that saying again? Those who don't know history are doomed to repeat it.

Mike Julian: Right, exactly. We've talked a lot about the failures and the challenges, and you keep mentioning this idea of the business case for serverless. So sell me on it. I want to deploy serverless in my company. I'm just an engineer, but I really like it, so I wanna move everything to it. I wanna do a new application in it.
What should I be thinking about? How do I come up with this business case?

Yan Cui: I think the most important question there is: what does the business care about? And pretty much every business I know of cares about delivery and speed. As a business, you want to deliver the best possible user experience and build the right features, the ones your users actually want. But to do that, you need to be able to hit the market quickly and inexpensively, so you can iterate on those ideas. That lets you tell the good ideas from the bad ones, and then you can double down on the good ideas and make them really great. And the more your engineering team has to do themselves, by definition, the slower you're going to be able to move. That's why businesses should care about serverless: it frees engineering teams from having to worry about a whole load of concerns. They don't need to worry about how the applications are hosted; they can let the real experts, the people who work for AWS, worry about that undifferentiated heavy lifting. And that frees up the brainpower you actually have — which, by the way, is super expensive — for solving the problems your users actually care about. No user cares about whether your application runs on containers or VMs or serverless, but they do care about when you're going to deliver, and they do care about you building the right features. And that, again, means you need to optimize for time to market and for the ability to iterate quickly.
A lot of people talk about vendor lock-in, worrying that one day Amazon will be holding the keys to your kingdom, but I think the real-

Mike Julian: That's the last thing I'm worried about.

Yan Cui: Yeah, exactly. I think the biggest thing we should worry about is a competitor who can iterate faster than you locking you out of the market altogether.

Mike Julian: Right.

Yan Cui: So I think that's why businesses should really, really care about serverless.

Mike Julian: I agree with that. That sounds great. The biggest thing that I see with engineers and their architectural decisions is that a lot of decisions are based essentially on resume-driven development. I've met a lot of engineers who say, I built this new application in Go because I wanted to learn Go, and I'm like, that's cool, what does the business have to say about that? And it's like, well, "I convinced my boss to use Go." I'm like, "No, you didn't. Your entire shop's in PHP; you basically just said PHP is shit. That was your business case." Instead, yes, we should be looking at this from the perspective of: how quickly can I get this new product to market? How quickly can I ship this feature? And yeah, there might be some scenarios where switching a language or switching a framework would be useful, but I agree with you that we really should be focused significantly more on time to market and time to value. We're here to help our businesses make money — or in my case, to help my own business make money. For me, I have an application that I'm writing in PHP right now. It's PHP and MySQL, and it's gonna be a core facet of my own company. Most engineers would say I'm crazy for writing PHP, but the entire point is that I don't have time to mess around. I need to have this out in the market.

Yan Cui: Yeah, absolutely, totally agree.
And I've had quite a few of those kinds of conversations in the past myself. I've also heard a lot of similar arguments about, say, why we should use functional programming. I was part of the functional programming community for quite a long time, and I'm still a big fan of functional programming — but not for the reason that it makes your code somewhat more readable. Again, it's about moving up the abstraction ladder so that I have to do less; it's about getting that leverage to be able to do more with less. I think that's the argument we should be making, as opposed to "this is how I like to read my code."

Mike Julian: Right. Let's take this from two different perspectives. For the people that are brand new to serverless, what can they do this week, or today, to learn more about it? And for the people that already have serverless in their infrastructure, what can they do this week to improve their situation?

Yan Cui: I think learning by doing is always the best way to get to grips with something. If you're just starting, with serverless it's so easy to get started and play around with something, and when you're done, you can just delete everything with CloudFormation in a single button click — or, if you're using the right tools, a single command. So definitely go build something. If you've got questions about how the platform behaves, build a proof of concept and try it out yourself. It's super, super simple nowadays. That's how I've learned a lot of what I know about serverless: just by running experiments.
Come up with a question, come up with a hypothesis about how I expect the platform to behave, and do a proof of concept to answer it. And then I like to write about things, so that I have a record for myself afterwards, and also so I can share what I've learned with other people.

Yan Cui: And if you've already started and want to take your game to the next level — I don't want to boast, but do check out my blog. I've shared a lot of what I've learned about running serverless in production, the problems you run into, and addressing a lot of the observability concerns. I also have a video course with Manning as well — feel free to check it out — where we actually build something from scratch and apply a lot of the things I've been talking about for the last year and a half, two years: how you do all the basic observability things, how to think about security, VPCs, performance, and so on. All of that will be available in the podcast episode notes. And also, just go out there and talk to other people and learn from them. There are a lot of very knowledgeable people in this space already — people like Ben Kehoe from iRobot, people like Paul Johnston and Jeremy Daly — quite a lot of people who have been very active in sharing their knowledge and their experiences as well. Definitely go out there, find other people who are doing this, and try to learn from them.

Mike Julian: That's awesome. So thank you so much for joining us. Where can people find more about you and your work?

Yan Cui: You can find me at theburningmonk.com — that's my blog, where I try to write actively — and you can also find me on Twitter. I try to share new things that I find interesting, anything I learn, and whenever I write something I publish it there as well. And if you don't wanna miss anything, I also have a newsletter you can subscribe to on my blog.
I write up regular summaries and updates on the things I've been doing. And I'm also available for consultancy work if you need some help in your organization — to get started, but also to tackle specific problems you have with serverless.

Mike Julian: Wonderful. Well, thank you so much for joining us. And on that note, thanks for listening to the Real World DevOps podcast. If you wanna stay up to date on the latest episodes, you can find us at realworlddevops.com, and on iTunes, Google Play, or wherever you get your podcasts. I'll see you in the next episode.

Yan Cui: See you, guys.


14 Feb 2019