Most of my games are for two players, and most of those are asymmetric: the Irish Player versus the Viking Player (cf. Kingdom of Dyflin), for example, or Patriot versus Crown (Supply Lines). Because each side has certain advantages and disadvantages - something that doesn't feature in a game with symmetrical player positions - much of my playtesting is taken up with ensuring those advantages and disadvantages achieve the proper balance.
A conventional and commonplace conception of balance holds that each side in a given match should have an equal chance of winning, assuming players of equal skill. This plays into what I sometimes refer to as a binary view of game balance - the idea that games are either balanced or they're not, yes or no, good or bad. 50/50 is balanced, but 51/49 is imbalanced.
I generally don't have much use for this definition because I prefer games where the balance is dynamic. In many of my designs, one player is going to start with some slight advantage that they want to turn into a larger one, while their opponent begins on the back foot but is looking for a way to turn the tables. Maybe the odds of winning are 52/48 or even 56/44 in red's favor at the start of the game, but over the course of the game they might shift back and forth - 49/51, 60/40, 40/60. Feedback loops are used to harden these advantages over time - it's going to be easier to turn a 70/30 into an 80/20 than it was to turn a 60/40 into a 70/30. The system becomes less resilient, more brittle. Most of my balance tweaks are about applying this concept in reverse - about making the system more resilient and less brittle early on, when the odds are going to be somewhat closer to even. It's still incumbent upon the player on the back foot to recognize the danger they're in and do something about it, but failure to do so on turn one (when the disparity is small) isn't going to be quite as catastrophic as failing to do so later in the game, when it becomes a fatal error.
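To make that "hardening" idea concrete, here's a toy model of my own construction - the numbers and the update rule are illustrative assumptions, not anything taken from an actual design. Red starts slightly favored; every exchange red wins nudges red's per-exchange win chance up a notch, and every exchange blue wins nudges it back down. The recursion computes exactly how fast the underdog's chances collapse as early losses pile up:

```python
GAIN, TURNS = 0.04, 10  # illustrative parameters, not from any real design

def blue_wins(p_red, turns_left):
    """Chance the trailing side (blue) ends the game with the odds in
    their favor, when each exchange red wins nudges red's per-exchange
    win chance up by GAIN and each exchange blue wins nudges it down."""
    p_red = min(max(p_red, 0.0), 1.0)
    if turns_left == 0:
        return 1.0 if p_red < 0.5 else 0.0
    return (p_red * blue_wins(p_red + GAIN, turns_left - 1)
            + (1 - p_red) * blue_wins(p_red - GAIN, turns_left - 1))

for losses in range(4):
    # blue's outlook after red takes the first few exchanges in a row
    p = 0.52 + GAIN * losses
    print(f"{losses} early losses: blue still wins "
          f"{blue_wins(p, TURNS - losses):.0%} of the time")
```

The point of the sketch is the shape of the output, not the specific numbers: each successive early loss costs the underdog more than the last one did, which is exactly the brittleness that positive feedback creates.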
Another metric that I find more useful is the win rate. Yes, a given match might favor one side or the other depending on player actions and the fuzziness of the system, but in a large enough series of matches, you will ideally see each side win an equal number of times. This isn't foolproof, of course, and like everything in game design, requires your judgment.
Let's say, for example, that we're going to play a game called Coin Flip. This is a game for two players, the Heads Player and the Tails Player, and how it works is you flip a coin, and if it comes up "heads", Heads wins, and if it comes up "tails", Tails wins. If we played this game a hundred times, one would expect, statistically, that Heads would win fifty games and Tails would win fifty games. If Heads wins ninety games out of a hundred, then probably there's something fishy about that coin.
But what if Heads wins sixty and Tails forty? That's a little trickier. Something could be off, or it could just be the way those particular coin flips panned out (i.e., Heads "played better"). If you were unlucky enough in school to have a math teacher who was the right mixture of apathetic and sadistic, you may even have spent the better part of an hour flipping a coin a hundred times and jotting down which side won - and almost none of those tallies actually came out fifty-fifty, simply because a hundred is actually a pretty small sample size.
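You don't even need the hour with the coin; the binomial math can be checked exactly. A quick sketch in plain Python - a perfectly fair coin lands exactly fifty-fifty in a hundred flips less than a tenth of the time, and a 60/40 split or worse for Heads still happens a few percent of the time with nothing fishy going on:

```python
from math import comb

def binom_pmf(n, k):
    """Probability of exactly k heads in n fair coin flips."""
    return comb(n, k) / 2**n

n = 100
p_exact_fifty = binom_pmf(n, 50)
p_sixty_plus = sum(binom_pmf(n, k) for k in range(60, n + 1))

print(f"P(exactly 50/50): {p_exact_fifty:.1%}")   # about 8%
print(f"P(Heads wins 60+): {p_sixty_plus:.1%}")   # about 3%
```

So out of every hundred playtest groups running a hundred fair matches each, a few of them will see a 60/40 split purely by chance.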
And all this is assuming a very simple and symmetrical "game", where both sides do have a static fifty-fifty chance of winning. Once the game is asymmetrical, and once the balance becomes dynamic and shifting, it becomes a lot trickier. Michigan might win 70 games of The Toledo War and Ohio 30, but in how many of those games did Michigan begin with an advantage, and how big was that advantage? What if Ohio had done X instead of Y? It being a card game, there's also the question of what if they had drawn this card instead of that one? Was the game a blowout, a foregone conclusion, or did Michigan win the thing from behind? More importantly, if we played another hundred games, would the numbers look any different?
Sussing this out requires judgment and experience, but most of all, patience. It's very easy to start to panic when the data looks lopsided. When I started playtesting The Toledo War, Ohio won six of the first thirteen games, and Michigan seven. Then, Ohio won twelve games in a row. Eighteen out of twenty-five! Twelve in a row! That last part was especially tough - each successive victory made it feel like Ohio's wins were inevitable, like the thing was broken. But then Michigan won several games in a row, and over the course of the first hundred games, Ohio was only slightly ahead. A hundred more games saw Michigan well ahead, and a hundred after that saw something closer to parity. (The beautiful thing about a ten-minute card game is you can test it pretty rapidly; there were nearly five hundred playtest matches for this little freebie game.) If I had given in to my panic after the first twenty-five matches, when Ohio was on its hot streak, I would likely have done damage to the design in my attempts to fix a problem that didn't exist.

Throughout the testing process, streaks were pretty common, and I think that's just a feature of testing an asymmetric two-player game; it's going to be very rare indeed that the sides will alternate wins. Heck, that's a feature of playing a two-player game out in the wild. I often see folks talking about how a game is imbalanced because they played it three times and the red side won all three games. Three games in a row is barely even a streak, but it's enough to disincline some players from trying it again. Which goes to show you that as important as balance is to the designer's craft - and it is crucial - it's often the perception of balance that matters once the game gets out into the world.
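That intuition - streaks are just what fair sequences look like - can also be checked exactly. Here's a sketch of my own (treating each game as a fair coin flip, which a real asymmetric game of course isn't), using a small dynamic program over the length of the current run:

```python
def p_run_at_least(n, k):
    """Probability that n fair coin flips contain a run of k or more
    consecutive identical results (either side). Exact, via a dynamic
    program over the current run length. Assumes n >= 1 and k >= 2."""
    if n < k:
        return 0.0
    # state[r] = probability the current run has length r and no
    # run of length k has appeared yet
    state = [0.0] * k
    state[1] = 1.0  # after the first flip, the run length is 1
    hit = 0.0       # probability a length-k run has already appeared
    for _ in range(n - 1):
        new = [0.0] * k
        for r in range(1, k):
            if r + 1 == k:
                hit += state[r] / 2      # next flip extends the run to k
            else:
                new[r + 1] += state[r] / 2
            new[1] += state[r] / 2       # next flip breaks the run
        state = new
    return hit

print(f"some side sweeps a 3-game set: {p_run_at_least(3, 3):.0%}")   # 25%
print(f"run of 5+ somewhere in 100 games: {p_run_at_least(100, 5):.0%}")
```

Even a perfectly fair coin sweeps a three-game set a quarter of the time, and over a hundred fair games a five-game streak for one side or the other shows up the vast majority of the time. Streaks aren't evidence of imbalance; they're the default.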